Background of the Invention

This application is a continuation-in-part of U.S. Patent
Application of Alizon et al. for "Cloned DNA Sequences Related to
the Entire Genomic RNA of Human Immunodeficiency Virus II (HIV-2),
Polypeptides Encoded by these DNA Sequences and Use of these DNA
Clones and Polypeptides in Diagnostic Kits," filed January 16,
1987, which is a continuation-in-part of U.S. Patent Application
Serial No. 931,866 filed November 21, 1986, which is a
continuation-in-part application of U.S. Patent Application Serial
No. 916,080 of Montagnier et al. for "Cloned DNA Sequences Related
to the Genomic RNA of the Human Immunodeficiency Virus II (HIV-2),
Polypeptides Encoded by these DNA Sequences and Use of these DNA
Clones and Polypeptides in Diagnostic Kits," filed October 6,
1986 and U.S. Patent Application Serial No. 835,228 of Montagnier
et al. for "New Retrovirus Capable of Causing AIDS, Antigens
Obtained from this Retrovirus and Corresponding Antibodies and
their Application for Diagnostic Purposes," filed March 3, 1986.
The disclosures of each of these predecessor applications are 
expressly incorporated herein by reference. 

The invention relates to cloned DNA sequences analogous to 
the genomic RNA of a virus known as Lymphadenopathy-Associated 
Virus II ("LAV-II"), a process for the preparation of these 
cloned DNA sequences, and their use as probes in diagnostic kits. 
In one embodiment, the invention relates to a cloned DNA sequence 
analogous to the entire genomic RNA of HIV-2 and its use as a 
probe. The invention also relates to polypeptides with amino
acid sequences encoded by these cloned DNA sequences and the use 
of these polypeptides in diagnostic kits. 

According to recently adopted nomenclature, as reported in
Nature, May 1986, a substantially identical group of retroviruses
which has been identified as one causative agent of AIDS is now
referred to as Human Immunodeficiency Virus I (HIV-1). This
previously-described group of retroviruses includes
Lymphadenopathy-Associated Virus I (LAV-I), Human T-cell
Lymphotropic Virus-III (HTLV-III), and AIDS-Related Virus (ARV).

Lymphadenopathy-Associated Virus II has been described in 
United States Application Serial No. 835,228, which was filed 
March 3, 1986, and is specifically incorporated herein by
reference. Because LAV-II is a second, distinct causative agent of
AIDS, LAV-II properly is classifiable as a Human Immunodeficiency 
Virus II (HIV-2). Therefore, "LAV-II" as used hereinafter 
describes a particular genus of HIV-2 isolates. 

While HIV-2 is related to HIV-1 by its morphology, its 
tropism and its in vitro cytopathic effect on CD4 (T4) positive
cell lines and lymphocytes, HIV-2 differs from previously
described human retroviruses known to be responsible for AIDS.
Moreover, the proteins of HIV-1 and 2 have different sizes and
their serological cross-reactivity is restricted mostly to the
major core protein, as the envelope glycoproteins of HIV-2 are
not immune precipitated by HIV-1-positive sera except in some
cases where very faint cross-reactivity can be detected. Since a
significant proportion of the HIV infected patients lack
antibodies to the major core protein of their infecting virus, it 
is important to include antigens to both HIV-1 and HIV-2 in an 
effective serum test for the diagnosis of the infection by these 
viruses. 

HIV-2 was first discovered in the course of serological research
on patients native to Guinea-Bissau who exhibited clinical
and immunological symptoms of AIDS and from whom sero-negative or
weakly sero-positive reactions to tests using an HIV-1 lysate
were obtained. Further clinical studies on these patients
isolated viruses which were subsequently named "LAV-II."

One LAV-II isolate, subsequently referred to as LAV-II MIR,
was deposited at the Collection Nationale des Cultures de
Micro-Organismes (CNCM) at the Institut Pasteur in Paris, France on
December 19, 1985 under Accession No. I-502 and has also been
deposited at the British ECACC under No. 87.001.001 on January
9, 1987. A second LAV-II isolate was deposited at CNCM on
February 21, 1986 under Accession No. I-532 and has also been
deposited at the British ECACC under No. 87.001.002 on January
9, 1987. This second isolate has been subsequently referred to
as LAV-II ROD. Other isolates deposited at the CNCM on December
19, 1986 are HIV-2 IRMO (No. I-642) and HIV-2 EHO (No. I-643).
Several additional isolates have been obtained from West African
patients, some of whom have AIDS, others with AIDS-related
conditions and others with no AIDS symptoms. All of these viruses
have been isolated on normal human lymphocyte cultures and some
of them were thereafter propagated on lymphoid tumor cell lines
such as CEM and MOLT.

Due to the sero-negative or weak sero-positive results
obtained when using kits designed to identify HIV-1 infections in
the diagnosis of these new patients with HIV-2 disease, it has
been necessary to devise a new diagnostic kit capable of
detecting HIV-2 infection, either by itself or in combination
with an HIV-1 infection. The present inventors have, through the
development of cloned DNA sequences analogous to at least a
portion of the genomic RNA of LAV-II ROD viruses, created the
materials necessary for the development of such kits.

Summary of the Invention 

As noted previously, the present invention relates to the 
cloned nucleotide sequences homologous or identical to at least a 
portion of the genomic RNA of HIV-2 viruses and to polypeptides 
encoded by the same. The present invention also relates to kits 
capable of diagnosing an HIV-2 infection. 

Thus, a main object of the present invention is to provide a 
kit capable of diagnosing an infection caused by the HIV-2 virus. 
This kit may operate by detecting at least a portion of the RNA 
genome of the HIV-2 virus or the provirus present in the infected 
cells through hybridization with a DNA probe or it may operate 
through the immunodiagnostic detection of polypeptides unique to
the HIV-2 virus. 

Additional objects and advantages of the present invention 
will be set forth in part in the description which follows, or 
may be learned from practice of the invention. The objects and
advantages may be realized and attained by means of the
instrumentalities and combinations particularly pointed out in the
appended claims. 

To achieve these objects and in accordance with the purposes 
of the present invention, cloned DNA sequences related to the 
entire genomic RNA of the LAV-II virus are set forth. These
sequences are analogous specifically to the entire genome of the
LAV-II ROD strain.

To further achieve the objects and in accordance with the 
purposes of the present invention, a kit capable of diagnosing an 
HIV-2 infection is described. This kit, in one embodiment, con- 
tains the cloned DNA sequences of this invention which are capa- 
ble of hybridizing to viral RNA or analogous DNA sequences to in- 
dicate the presence of an HIV-2 infection. Different diagnostic 
techniques can be used which include, but are not limited to: 
(1) Southern blot procedures to identify viral DNA which may or 
may not be digested with restriction enzymes; (2) Northern blot 
techniques to identify viral RNA extracted from cells; and 
(3) dot blot techniques, i.e., direct filtration of the sample 
through an ad hoc membrane such as nitrocellulose or nylon with- 
out previous separation on agarose gel. Suitable material for 
dot blot technique could be obtained from body fluids including, 
but not limited to, serum and plasma, supernatants from culture 
cells, or cytoplasmic extracts obtained after cell lysis and
removal of membranes and nuclei of the cells by
ultra-centrifugation as accomplished in the "CYTODOT" procedure
as described in a booklet published by Schleicher and Schuell.

In an alternate embodiment, the kit contains the
polypeptides created using these cloned DNA sequences. These
polypeptides are capable of reacting with antibodies to the HIV-2
virus present in sera of infected individuals, thus yielding an
immunodiagnostic complex.

To further achieve the objects of the invention, a 
vaccinating agent is provided which comprises at least one 
peptide selected from the polypeptide expression products of the 
viral DNA in admixture with suitable carriers, adjuvants, or
stabilizers.

It is understood that both the foregoing general description 
and the following detailed description are exemplary and explana- 
tory only and are not restrictive of the invention as claimed. 
The accompanying drawings, which are incorporated in and consti- 
tute a part of the specification, illustrate one embodiment of 
the invention and, together with the description, serve to 
explain the principles of the invention. 

Brief Description of the Drawings 

Figure 1 generally depicts the nucleotide sequence of a 
cloned complementary DNA (cDNA) to the genomic RNA of HIV-2. 
Figure 1A depicts the genetic organization of HIV-1, position of
the HIV-1 HindIII fragment used as a probe to screen the cDNA
library, and restriction map of the HIV-2 cDNA clone, E2.
Figure 1B depicts the nucleotide sequence of the 3' end of HIV-2.
The corresponding region of the HIV-1 LTR was aligned using the
Wilbur and Lipman algorithm (window: 10; K-tuple: 7; gap penalty:
3) as described by Wilbur and Lipman in Proc. Natl. Acad. Sci.
USA 80: 726-730 (1983), specifically incorporated herein by
reference. The U3-R junction in HIV-1 is indicated and the poly A
addition signal and potential TATA promoter regions are boxed.
In Figure 1B, the symbols B, H, Ps and Pv refer to the
restriction sites BamHI, HindIII, PstI and PvuII, respectively.

Figure 2 generally depicts the HIV-2 specificity of the E2 
clone. Figure 2A and B specifically depict a Southern blot of
DNA extracted from CEM cells infected with the following
isolates: HIV-2 ROD (a,c), HIV-2 DUL (b,d), and HIV-1 BRU (e,f).
DNA in lanes a,b,f was PstI digested; in lanes c,d,e DNA was
undigested. Figure 2C and D specifically depict dot blot
hybridization of pelleted virions from CEM cells infected by
HIV-1 BRU (1), Simian Immunodeficiency Virus (SIV) isolate Mm
142-83 (3), HIV-2 DUL (4), HIV-2 ROD (5), and HIV-1 ELI (6).
Dot 2 is a pellet from an equivalent volume of supernatant from
uninfected CEM. Thus, Figure 2A and C depict hybridization with
the HIV-2 cDNA (E2) and Figure 2B and D depict hybridization to
an HIV-1 probe consisting of a 9 Kb SacI insert from HIV-1
BRU (clone lambda J19).

Figure 3 generally depicts a restriction map of the HIV-2 
ROD genome and its homology to HIV-1. Figure 3A specifically 
depicts the organization of three recombinant phage lambda 
clones, ROD 4, ROD 27, and ROD 35. In Figure 3A, the open boxes
represent viral sequences, the LTR are filled, and the dotted
boxes represent cellular flanking sequences (not mapped). Only
some characteristic restriction enzyme sites are indicated.
λROD 27 and λROD 35 are derived from integrated proviruses
while λROD 4 is derived from a circular viral DNA. The portion
of the lambda clones that hybridizes to the cDNA E2 is indicated
below the maps. A restriction map of the λROD isolate was
reconstructed from these three lambda clones. In this map, the
restriction sites are identified as follows: B: BamHI; E: EcoRI;
H: HindIII; K: KpnI; Ps: PstI; Pv: PvuII; S: SacI; X: XbaI.
R and L are the right and left BamHI arms of the lambda L47.1
vector.

Figure 3B specifically depicts dots 1-11 which correspond to
the single-stranded DNA form of M13 subclones from the HIV-1 BRU
cloned genome (λJ19). Their size and position on the HIV-1
genome, determined by sequencing, are shown below the figure.
Dot 12 is a control containing lambda phage DNA. The dot-blot
was hybridized in low stringency conditions as described in
Example 1 with the complete λROD 4 clone as a probe, and
successively washed in 2x SSC, 0.1% SDS at 25°C (Tm -42°C), 1x
SSC, 0.1% SDS at 60°C (Tm -20°C), and 0.1x SSC, 0.1% SDS at
60°C (Tm -3°C) and exposed overnight. A duplicate dot blot was
hybridized and washed in stringent conditions (as described in
Example 2) with the labelled lambda J19 clone carrying the
complete HIV-1 BRU genome. HIV-1 and HIV-2 probes were labelled to
the same specific activity (10^8 cpm/ug.).

Figure 4 generally depicts the restriction map polymorphism
in different HIV-2 isolates and shows comparison of HIV-2 to SIV.
Figure 4A specifically depicts DNA (20 ug. per lane) from CEM
cells infected by the isolate HIV-2 DUL (panel 1) or peripheral
blood lymphocytes (PBL) infected by the isolates HIV-2 GOM
(panel 2) and HIV-2 MIR (panel 3) digested with: EcoRI (a), PstI
(b), and HindIII (c). Much less viral DNA was obtained with
HIV-2 isolates propagated on PBL. Hybridization and washing were
in stringent conditions, as described in Example 2, with 10^6
cpm/ml. of each of the E2 insert (cDNA) and the 5 kb. HindIII
fragment of λROD 4, labelled to 10^9 cpm/ug.

Figure 4B specifically depicts DNA from HUT 78 (a human T 
lymphoid cell line) cells infected with STLV3 MAC isolate Mm 
142-83. The same amounts of DNA and enzymes were used as indi- 
cated in panel A. Hybridization was performed with the same 
probe as in A, but in non-stringent conditions. As described in
Example 1, washing was for one hour in 2x SSC, 0.1% SDS at 40°C
(panel 1) and after exposure, the same filter was re-washed in
0.1x SSC, 0.1% SDS at 60°C (panel 2). The autoradiographs were
obtained after overnight exposure with intensifying screens.

Figure 5 depicts the position of derived plasmids 



ferred embodiments of the invention, which, together with the 
following examples, serve to explain the principles of the 
invention. 

The genetic structure of the HIV-2 virus has been analyzed
by molecular cloning according to the method set forth herein and 
in the Examples. A restriction map of the genome of this virus 
is included in Figure 4. In addition, the partial sequence of a 
cDNA complementary to the genomic RNA of the virus has been 
determined. This cDNA sequence information is included in 
Figure 1. 

Also contained herein is data describing the molecular 
cloning of the complete 9.5 kb genome of HIV-2, data describing 
the observation of restriction map polymorphism between different 
isolates, and an analysis of the relationship between HIV-2 and 
other human and simian retroviruses. From the totality of these 
data, diagnostic probes can be discerned and prepared. 

Generally, to practice one embodiment of the present invention,
a series of filter hybridizations of the HIV-2 RNA genome
with probes derived from the complete cloned HIV-1 genome and
from the gag and pol genes was conducted. These hybridizations
yielded only extremely weak signals even in conditions of very
low stringency of hybridization and washing. Thus, it was found to
be difficult to assess the amount of HIV-2 viral and proviral DNA
in infected cells by Southern blot techniques.

Therefore, a complementary DNA (cDNA) to the HIV-2 genomic 
RNA initially was cloned in order to provide a specific
hybridization probe. To construct this cDNA, an oligo (dT) primed
cDNA first-strand was made in a detergent-activated endogenous
reaction using HIV-2 reverse transcriptase with virions purified
from supernatants of infected CEM cells. The CEM cell line is a
lymphoblastoid CD4+ cell line described by G.E. Foley et al. in
Cancer 18: 522-529 (1965), specifically incorporated herein by
reference. The CEM cells used were infected with the isolate ROD 
and were continuously producing high amounts of HIV-2. 

After second-strand synthesis, the cDNAs were inserted into
the M13 tg 130 bacteriophage vector. A collection of 10^4 M13
recombinant phages was obtained and screened in situ with an
HIV-1 probe spanning 1.5 kb. of the 3' end of the LAV BRU isolate
(depicted in Figure 1A). Some 50 positive plaques were detected,
purified, and characterized by end sequencing and
cross-hybridizing the inserts. This procedure is described in more
detail in Example 1 and in Figure 1.

The different clones were found to be complementary to the
3' end of a polyadenylated RNA having the AATAAA signal about 20
nucleotides upstream of the poly A tail, as found in the long
terminal repeat (LTR) of HIV-1. The LTR region of HIV-1 has been
described by S. Wain Hobson et al. in Cell 40: 9-17 (1985),
specifically incorporated herein by reference. The portion of the
HIV-2 LTR that was sequenced was related only distantly to the
homologous domain in HIV-1 as demonstrated in Figure 1B. Indeed,
only about 50% of the nucleotides could be aligned and
about a hundred insertions/deletions need to be introduced. In
comparison, the homology of the corresponding domains in HIV-1
isolates from USA and Africa is greater than 95% and no
insertions or deletions are seen.
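
By way of illustration only, the following minimal Python sketch shows one way in which the fraction of identical nucleotides and the number of insertions/deletions might be tallied from a gapped pairwise alignment. It is not the Wilbur and Lipman procedure used for Figure 1B. The function name alignment_stats and the two short sequences are hypothetical and are not taken from the HIV-1 or HIV-2 sequences.

```python
def alignment_stats(a: str, b: str) -> dict:
    """Summarize a pairwise alignment given as two equal-length gapped strings.

    '-' marks a gap. A run of consecutive gap columns in the same sequence
    is counted as one insertion/deletion event.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")

    matches = mismatches = gap_columns = gap_runs = 0
    prev_gap_side = None  # which sequence carried the gap in the previous column

    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-" or y == "-":
            gap_columns += 1
            side = "a" if x == "-" else "b"
            if side != prev_gap_side:
                gap_runs += 1  # a new insertion/deletion event starts here
            prev_gap_side = side
        else:
            prev_gap_side = None
            if x == y:
                matches += 1
            else:
                mismatches += 1

    aligned = matches + mismatches  # columns with a nucleotide in both sequences
    percent_identity = 100.0 * matches / aligned if aligned else 0.0
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gap_columns": gap_columns,
        "gap_runs": gap_runs,
        "percent_identity": percent_identity,
    }


# Hypothetical toy alignment (not real viral sequence).
toy_a = "AATAAAGC-TTGCC"
toy_b = "AATAAAGCATT-CA"
stats = alignment_stats(toy_a, toy_b)
```

In this sketch percent identity is computed over ungapped columns only; other conventions (for example, dividing by the full alignment length) give different figures, so the convention should be stated whenever such a percentage is reported.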

The largest insert of this group of M13 clones was a 2 kb.
clone designated E2. Clone E2 was used as a probe to demonstrate
its HIV-2 specificity in a series of filter hybridization
experiments. Firstly, this probe could detect the genomic RNA of HIV-2
but not HIV-1 in stringent conditions as shown in Figure 2, C and
D. Secondly, positive signals were detected in Southern blots of
DNA from cells infected with the ROD isolate as well as other
isolates of HIV-2 as shown in Figure 2, A and Figure 4, A. No
signal was detected with DNA from uninfected cells or HIV-1
infected cells, confirming the exogenous nature of HIV-2. In
undigested DNA from HIV-2 infected cells, an approximately 10 kb.
species, probably corresponding to linear unintegrated viral DNA,
was principally detected along with a species with an apparent
size of 6 kb., likely to be the circular form of the viral DNA.
Conversely, rehybridization of the same filter with an HIV-1
probe under stringent conditions showed hybridization to HIV-1
infected cells only as depicted in Figure 2, B.

To isolate the remainder of the genome of HIV-2, a genomic 
library in lambda phage L47.1 was constructed. Lambda phage 
L47.1 has been described by W.A.M. Loenen et al. in Gene 10:
249-259 (1980), specifically incorporated herein by reference.
The genomic library was constructed with a partial Sau3AI
restriction digest of the DNA from the CEM cell line infected with
HIV-2 ROD.

About 2 x 10^6 recombinant plaques were screened in situ with
labelled insert from the E2 cDNA clone. Ten recombinant phages
were detected and plaque purified. Of these phages, three were
characterized by restriction mapping and Southern blot
hybridization with the E2 insert and probes from its 3' end (LTR)
or 5' end (envelope), as well as with HIV-1 subgenomic probes.
In this instance, HIV-1 probes were used under non-stringent con- 
ditions. 

A clone carrying a 9.5 kb. insert and derived from a circular
viral DNA was identified as containing the complete genome
and designated λROD 4. Two other clones, λROD 27 and λROD 35,
were derived from integrated proviruses and found to carry an LTR
and cellular flanking sequences and a portion of the viral coding
sequences as shown in Figure 3, A.

Fragments of the lambda clones were subcloned into a plasmid
vector pUC 18.

Plasmid pROD 27-5' is derived from λROD 27 and contains the
5' 2 Kb of the HIV-2 genome and cellular flanking sequences (5'
LTR and 5' viral coding sequences to the EcoRI site).

Plasmid pROD 4-8 is derived from λROD 4 and contains the
about 5 Kb HindIII fragment that is the central part of the HIV-2
genome.

Plasmid pROD 27-5' and pROD 4.8 inserts overlap.

Plasmid pROD 4.7 contains a HindIII 1.8 Kb fragment from
λROD 4. This fragment is located 3' to the fragment subcloned
into pROD 4.8 and contains about 0.8 Kb of viral coding sequences
and the part of the lambda phage (λL47.1) left arm located
between the BamHI and HindIII cloning sites.

Plasmid pROD 35 contains all the HIV-2 coding sequences 3'
to the EcoRI site, the 3' LTR and about 4 Kb of cellular flanking
sequences.

Plasmid pROD 27-5' and pROD 35 in E. coli strain HB101 are
deposited respectively under No. I-626 and I-633 at the CNCM, and
have also been deposited at the NCIB (British Collection). These
plasmids are depicted in Figure 5. Plasmids pROD 4-7 and pROD
4-8 in E. coli strain TG1 are deposited respectively under
No. I-627 and I-628 at the CNCM.

To reconstitute the complete HIV-2 ROD genome, pROD 35 is 
linearized with Eco RI and the Eco RI insert of pROD 27-5' is 
ligated in the correct orientation into this site. 

The relationship of HIV-2 to other human and simian
retroviruses was surmised from hybridization experiments. The
relative homology of the different regions of the HIV-1 and 2 genomes
was determined by hybridization of fragments of the cloned HIV-1
genome with the labelled λROD 4 expected to contain the complete
HIV-2 genome (Figure 3, B). Even in very low stringency
conditions (Tm -42°C), the hybridization of HIV-1 and 2 was restricted
to a fraction of their genomes, principally the gag gene (dots 1
and 2), the reverse transcriptase domain in pol (dot 3), the end
of pol and the Q (or sor) genes (dot 5) and the F gene (or 3'
orf) and 3' LTR (dot 11). The HIV-1 fragment used to detect the
HIV-2 cDNA clones contained the dot 11 subclone, which hybridized
well to HIV-2 under non-stringent conditions. Only the signal
from dot 5 persisted after stringent washing. The envelope gene,
the region of the tat gene and a part of pol thus seemed very
divergent. These data, along with the LTR sequence obtained
(Figure 1, B), indicated that HIV-2 is not an envelope variant of
HIV-1, as are African isolates from Zaire described by Alizon et
al., Cell 46: 63-74 (1986).

It was observed that HIV-2 is related more closely to the 
Simian Immunodeficiency Virus (SIV) than it is to HIV-1. This 
correlation has been described by F. Clavel et al. in C.R. Acad. 
Sci. (Paris) 302: 485-488 (1986) and F. Clavel et al. in Science
233: 343-346 (1986), both of which are specifically incorporated
herein by reference. Simian Immunodeficiency Virus (also
designated Simian T-cell Lymphotropic Virus Type 3, STLV-3) is a
retrovirus first isolated from captive macaques with an AIDS-like
disease in the USA. This simian virus has been described by M.D.
Daniel et al. in Science 228: 1201-1204 (1985), specifically
incorporated herein by reference.

All the SIV proteins, including the envelope, are immune
precipitated by sera from HIV-2 infected patients, whereas the
serological cross-reactivity of HIV-1 to 2 is restricted to the
core proteins. However, SIV and HIV-2 can be distinguished by
slight differences in the apparent molecular weight of their
proteins.

In terms of nucleotide sequence, it also appears that HIV-2 
is closely related to SIV. The genomic RNA of SIV can be 
detected in stringent conditions as shown in Figure 2, C by HIV-2 
probes corresponding to the LTR and 3' end of the genome (E2) or
to the gag or pol genes. Under the same conditions, HIV-1
derived probes do not detect the SIV genome as shown in Figure 2, 
D. 

In Southern blots of DNA from SIV-infected cells, a
restriction pattern clearly different from HIV-2 ROD and other isolates
is seen. All the bands persist after a stringent washing, even 
though the signal is considerably weakened, indicating a sequence 
homology throughout the genomes of HIV-2 and SIV. It has re- 
cently been shown that baboons and macaques could be infected 
experimentally by HIV-2, thereby providing an interesting animal 
model for the study of the HIV infection and its preventive ther- 
apy. Indeed, attempts to infect non-human primates with HIV-1 
have been successful only in chimpanzees, which are not a conve- 
nient model. 

From an initial survey of the restriction maps for certain 
of the HIV-2 isolates obtained according to the methods described 
herein, it is already apparent that HIV-2, like HIV-1, undergoes 
restriction site polymorphism. Figure 4A depicts examples of
such differences for three isolates, all different one from
another and from the cloned HIV-2 ROD. It is very likely that
these differences at the nucleotide level are accompanied by
variations in the amino-acid sequence of the viral proteins, as
evidenced in the case of HIV-1 and described by M. Alizon et al.
in Cell 46: 63-74 (1986), specifically incorporated herein by
reference. It is also to be expected that the various isolates
of HIV-2 will exhibit amino acid heterogeneities. See, for
example, Clavel et al., Nature 324 (18): 691-695 (1986),
specifically incorporated herein by reference.

Further, the characterization of HIV-2 will also delineate the
domain of the envelope glycoprotein that is responsible for the
binding to the surface of the target cells and the subsequent
internalization of the virus. This interaction was shown to be
mediated by the CD4 molecule itself in the case of HIV-1 and
similar studies tend to indicate that HIV-2 uses the same receptor.
Thus, although there is wide divergence between the env genes of 
HIV-1 and 2, small homologous domains of the envelopes of the two 
HIV could represent a candidate receptor binding site. This site 
could be used to raise a protective immune response against this 
group of retroviruses. 

From the data discussed herein, certain nucleotide sequences 
have been identified which are capable of being used as probes in 
diagnostic methods to obtain the immunological reagents necessary 
to diagnose an HIV-2 infection. In particular, these sequences 
may be used as probes in hybridization reactions with the genetic 
material of infected patients to indicate whether the RNA of the 
HIV-2 virus is present in these patients' lymphocytes or whether
an analogous DNA is present. In this embodiment, the test
methods which may be utilized include Northern blots, Southern blots
and dot blots. One particular nucleotide sequence which may be
useful as a probe is the combination of the 5 kb. HindIII
fragment of ROD 4 and the E2 cDNA used in Figure 4.

In addition, the genetic sequences of the HIV-2 virus may be
used to create the polypeptides encoded by these sequences.
Specifically, these polypeptides may be created by expression of the
cDNA obtained according to the teachings herein in hosts such as
bacteria, yeast or animal cells. These polypeptides may be used
in diagnostic tests such as immunofluorescence assays (IFA),
radioimmunoassays (RIA) and Western Blot tests.

Moreover, it is also contemplated that additional diagnostic 
tests, including additional immunodiagnostic tests, may be
developed in which the DNA probes or the polypeptides of this
invention may serve as one of the diagnostic reagents. The
invention described herein includes these additional test
methods.

In addition, monoclonal antibodies to these polypeptides or 
fragments thereof may be created. The monoclonal antibodies may 
be used in immunodiagnostic tests in a manner analogous to the
polypeptides described above.

The polypeptides of the present invention may also be used 
as immunogenic reagents to induce protection against infection by 
HIV-2 viruses. In this embodiment, the polypeptides produced by 
recombinant-DNA techniques would function as vaccine agents. 

Also, the polypeptides of this invention may be used in
competitive assays to test the ability of various antiviral agents
to prevent the virus from fixing on its target.

Thus, it is to be understood that application of the
teachings of the present invention to a specific problem or
environment will be within the capabilities of one having ordinary skill
in the art in light of the teachings contained herein. Examples
of the products of the present invention and representative
processes for their isolation and manufacture appear above and in
the following examples.

EXAMPLES 

Example 1: Cloning of a cDNA Complementary to
Genomic RNA From HIV-2 Virions

HIV-2 virions were purified from 5 liters of supernatant
from a culture of the CEM cell line infected with the ROD isolate
and a cDNA first strand using oligo (dT) primer was synthesized
in a detergent-activated endogenous reaction on pelleted virus, as
described by M. Alizon et al. in Nature 312: 757-760 (1984),
specifically incorporated herein by reference. RNA-cDNA hybrids
were purified by phenol-chloroform extraction and ethanol
precipitation. The second-strand cDNA was created by the DNA
polymerase I/RNAase H method of Gubler and Hoffman in Gene 25:
263-269 (1983), specifically incorporated herein by reference,
using a commercial cDNA synthesis kit obtained from Amersham.
After attachment of EcoRI linkers (obtained from Pharmacia),
EcoRI digestion, and ligation into EcoRI-digested
dephosphorylated M13 tg 130 vector (obtained from Amersham), a
cDNA library was obtained by transformation of the E. coli TG1
strain. Recombinant plaques (10^4) were screened in situ on
replica filters with the 1.5 kb. HindIII fragment from clone J19,
corresponding to the 3' part of the genome of the LAV BRU isolate
of HIV-1, 32P-labelled to a specific activity of 10^9 cpm/ug. The
filters were prehybridized in 5 x SSC, 5 x Denhardt solution, 25%
formamide, and denatured salmon sperm DNA (100 ug./ml.) at 37°C
for 4 hours and hybridized for 16 hours in the same buffer (Tm
-42°C) plus 4 x 10^7 cpm of the labelled probe (10^6 cpm/ml. of
hybridization buffer). The washing was done in 5 x SSC, 0.1% SDS
at 25°C for 2 hours. 20 x SSC is 3M NaCl, 0.3M Na citrate.
Positive plaques were purified and single-stranded M13 DNA pre- 
pared and end-sequenced according to the method described in 
Proc. Nat* L Acad. Sci. USA, 74: 5463-5467 (1977) of Sanger et 



DNA was extracted from infected CEM cells continuously pro- 
ducing HIV-1 or 2. The DNA digested with 20 ug of Pst I digested 
with or undigested, was elect rophoresed on a 0.8% agarose gel, 
and Southern-transferred to nylon membrane. Virion dot-blots 
were prepared in duplicate, as described by F. Clavel et al . in 
Science 233 ; 343-346 (1986), specifically incorporated herein by 
reference, by pelleting volumes of supernatant corresponding to 
the same amount of reverse transcriptase activity. 

Prehybr idizat ion was done in 50% formamide, 5 x SSC, 5 x Denhardt 
solution, and 100 mg./ml. denatured salmon sperm DNA for 4 hours 
at 42°C. Hybridization was performed in the same buffer plus 10% 
Dextran sulphate, and 10^ cpm/ml. of the labelled E2 insert 




Example 2 ; 



al. 



Hybridization of DNA from HIV-1 and 
HIV-2 Infected Cells and RNA from HIV-1 
and 2 and SIV Virons With a Probe 
Derived From an HIV-2 Cloned cDNA 
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(specific activity 10 y cpm/ug.) for 16 hours at 42°C. Washing 

was in 0.1 x SSC, 0.1% SDS for 2 x 30 mn. After exposition for 

16 hours with intensifying screens, the Southern blot was 

dehybridized in 0.4 N NaOH, neutralized, and rehybridized in the 

same conditions to the HIV-1 probe labelled to 10^ cpm/ug. 
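
The buffer strengths quoted in Examples 1 and 2 are all multiples of the 20 x SSC stock defined in Example 1 (3M NaCl, 0.3M Na citrate). As an illustrative sketch only, and not part of the disclosed procedure, the following Python fragment converts an "x SSC" strength into molar concentrations and into the volume of stock required. The function names are hypothetical and a simple linear dilution is assumed.

```python
# Stock definition taken from Example 1: 20 x SSC is 3 M NaCl, 0.3 M Na citrate.
STOCK_STRENGTH = 20.0
STOCK_MOLARITY = {"NaCl": 3.0, "Na citrate": 0.3}


def ssc_molarity(strength):
    """Molar concentration of each salt at the given 'x SSC' strength.

    Assumes plain linear dilution of the 20 x stock (hypothetical helper).
    """
    factor = strength / STOCK_STRENGTH
    return {salt: molar * factor for salt, molar in STOCK_MOLARITY.items()}


def stock_volume_ml(strength, final_volume_ml):
    """Volume of 20 x stock (ml) needed for final_volume_ml at 'strength' x SSC."""
    return final_volume_ml * strength / STOCK_STRENGTH


if __name__ == "__main__":
    # 5 x SSC is used for (pre)hybridization, 0.1 x SSC for the stringent wash.
    for strength in (5.0, 0.1):
        conc = ssc_molarity(strength)
        print(f"{strength} x SSC: "
              f"{conc['NaCl']:.4f} M NaCl, {conc['Na citrate']:.5f} M Na citrate")
    print(f"Stock for 100 ml of 5 x SSC: {stock_volume_ml(5.0, 100.0)} ml")
```

The lower salt concentration of the 0.1 x SSC wash in Example 2, compared with the 5 x SSC wash in Example 1, is what makes the former the more stringent condition.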

Example 3 : Cloning in Lambda Phage of the
Complete Provirus DNA of HIV-2

DNA from HIV-2 ROD-infected CEM cells (Figure 2, lanes a and c)
was partially digested with Sau3AI. The 9-15 kb. fraction was
selected on a 5-40% sucrose gradient and ligated to BamHI arms of
the lambda L47.1 vector. Plaques (2 x 10^6) obtained after in
vitro packaging and plating on E. coli LA 101 strain were
screened in situ with the insert from the E2 cDNA clone.
Approximately 10 positive clones were plaque purified and propagated on
E. coli C600 recBC. The ROD 4, 27, and 35 clones were amplified
and their DNA characterized by restriction mapping and
Southern blotting with the HIV-2 cDNA clone under stringent
conditions, and with gag-pol probes from HIV-1 used under
nonstringent conditions.

Example 4 : Complete Genomic Sequence of
the ROD HIV-2 Isolate



Experimental analysis of the HIV-2 ROD isolate yielded the
following sequence, which represents the complete genome of this
HIV-2 isolate. Genes and major expression products identified
within the following sequence are indicated by the nucleotide
numbers below:

1) GAG gene (546-2111) expresses a protein product having
a molecular weight of around 55 Kd, which is cleaved into the
following proteins:

a) p16 (546-950)

b) p26 (951-1640)

c) p12 (1701-2111)

2) polymerase (1829-4936)

3) Q protein (4869-5513)

4) R protein (5682-5996)

5) X protein (5344-5679)

6) Y protein (5682-5996)

7) Env protein (6147-8720)

8) F protein (8557-9324)

9) TAT gene (5845-6140 and 8307-8400) is expressed by two
exons separated by introns.

10) ART protein (6071-6140 and 8307-8536) is similarly the
expression product of two exons.

11) LTR: R (1-173 and 9498-9671)

12) U5 (174-299)

13) U3 (8942-9497)

It will be known to one of skill in the art that the
absolute numbering which has been adopted is not essential. For
example, the nucleotide within the LTR which is designated as "1"
is a somewhat arbitrary choice. What is important is the sequence
information provided.
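
Because the coordinates above are given as inclusive nucleotide ranges, the length of each region, the extent of any overlap between regions, and the effect of choosing a different nucleotide as "1" can all be derived mechanically. The following Python fragment is a minimal sketch of that bookkeeping, offered for illustration only. The coordinates are copied from the list above, while the dictionary layout and the function names are assumptions introduced here.

```python
# Coordinates copied from the list in Example 4 (1-based, inclusive ranges).
# Spliced products (TAT, ART) carry one range per exon.
FEATURES = {
    "GAG": [(546, 2111)],
    "p16": [(546, 950)],
    "p26": [(951, 1640)],
    "p12": [(1701, 2111)],
    "polymerase": [(1829, 4936)],
    "Q": [(4869, 5513)],
    "R": [(5682, 5996)],
    "X": [(5344, 5679)],
    "Y": [(5682, 5996)],
    "Env": [(6147, 8720)],
    "F": [(8557, 9324)],
    "TAT": [(5845, 6140), (8307, 8400)],
    "ART": [(6071, 6140), (8307, 8536)],
}


def length(name, features=FEATURES):
    """Total number of nucleotides covered by a feature (all exons summed)."""
    return sum(end - start + 1 for start, end in features[name])


def overlap(a, b, features=FEATURES):
    """Number of nucleotides shared by two features."""
    shared = 0
    for s1, e1 in features[a]:
        for s2, e2 in features[b]:
            shared += max(0, min(e1, e2) - max(s1, s2) + 1)
    return shared


def renumber(offset, features=FEATURES):
    """Shift every coordinate by -offset, i.e. choose a different nucleotide as '1'."""
    return {
        name: [(start - offset, end - offset) for start, end in ranges]
        for name, ranges in features.items()
    }


if __name__ == "__main__":
    print("GAG length:", length("GAG"))
    print("GAG/polymerase overlap:", overlap("GAG", "polymerase"))
    print("TAT exon total:", length("TAT"))
    # Renumbering changes the labels but not lengths or overlaps.
    shifted = renumber(545)
    print("GAG after renumbering:", shifted["GAG"])
```

Lengths and overlaps are unchanged by renumbering, which is the sense in which the absolute numbering is not essential.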



CGICqCTCTCCCCACAGGCTCGCACATTCACCCCTGCCACCTTCICTCCAGCACTACCAC 

gtagagcctcggtgttccctgctagactctcaccagcacttgcccggtgctgggcagacc 

GCCCCACGCTTGCTTGCTTAAAAACCTCTTAATAAAGCTGCCACTTAGAAGCAAGTTAAG 

TGTGTGCTCCCATCTCTCCTAGTCGCCCCCTGGTCATTCGGTGTTCACCTGACTAACAAG 

200 • * ' ■ * 

ACCCTGGTCTCTTAGGACCCTTCTTGCTTIGCGAAACCGAGCCAGGAAAATCCCtAGCAG 

GTTGGCGCCTCAACAGGGACTTCAAGAAGACTGAGAAGTCTTGGAACACG0CTCACT6AA 

CGCAGTAAGCGCGCCACGAACAAACCACCACGCAGTGCTCCtAGAAACOCGCGOGCCGAG 

GTACCAAAGGCAGCGTCTGCACCCGCAGGAGAAGACGCCTCCCGGTGAACGTAAGTACCT 

"UCACCAAAAACTGTAGCCGAAACGCCTTGCTATCCTACCTTTAGACACGTAGAAGATTGT 

!!! MetclyAl*Ar t A5oS«rVtlL«uArsGlyLy.Li»Al«A«pGiuLeuGluArgIl« 
!il GGGAGATCGGCGCGAGAAACTCCCTCTTGAGAGGGAAAAAACCAGAXGAATTAGAAAGAA 

U Ar 8 LeuArgProGlyGlyLy«LytL,iTyrArgL«uUy«Hi«IleV.lTrjAljAl.A.« 
^ TCAGGTTACCCCCCGCCCGAAAGAAAAAGTACACGCTAAAACATATTGTCTGGGCAGCCA 

d • 

Lv.L«uAiDAraPh«GiiL«uAUGluS«rL«uLettGlu6«rLy«GluGlyCy«GlnLy» 

- AlilA^wIJiSAi^GGATTACCAGAGAGCCTGTTGGAGTCAAAACAGGCTTGtCAAA 

7 0 v • * 

•I IleL«uThrV.lL.uA.pProM.tV.Lp\oThrGlyS.rGluA.^ 
(I AAATTCTTACAGTTTTAGATCCAATCGTACCGACAGCTTCAGAAAATTTAAAAACTCTTT 

;! A.nThrvilCy.V.lIliTrpCy.ll.Hi.AUGluciuLy.;^ 
" TTAATACTGTCTGCGTCAITTGGTGCAIACACGCAGAACAGAAAGTCAAACATACTGAAG 

" ""tOO • . * 

Al.Ly.clnlleV.iArgArgHi.LeuV.lAlaGluThrG^ 
CAGCAAAACAAATACTGCGCAGACATCTAGTCCCAGAAACAGGAACTGCAGAGAAAATGC 

S«rThr8«rArgProThrAl«ProS«r8«rGluLy«GlyClyA«ttTyrProV*lGioHi« 

caaSJacIISJaJacSIacIgcaccatctaccgacaaccgagcaaattacccagtg 

V«lGlyGlyA«nTyrThrHi8lleProL«uS«rProArgThrLtuAanAl«TrpV»ll.y» 

atgJaggcggcaactacacccatataccgctcactccccgaaccctaaatgcctcgctaa 

1000 

L.uVAlGiuGiuLy.LysPhtClyAl.GiuV.lV.lProGiyPheGlnAiaLtuSerGlu 

aattagtagaggaaaaaaagttcgggccacaagtactgccacgatttcacgcactctcao

JJtfCCTCCACCCCCTATCATAlCAACG ^ . ^

CC*tOC*e*t**TC46CC4CAttATC* ....^.H 1.A1.G1» 

CAATACCACGCCCCtTACCAeCCGGG .,.,«.Ol«A..»««»«»" 

CCACAACAAGCACACTAGAAGAACACATC ^ „. ..cviv.JAr.M.tlyr 

CACTAGGAAACATCTATAGAAGATGGA ^ lu p to phicin3«rTyrV« 1 

A..P,.ti t A..XUW:»^i:i i - ASS6 lUAAlSoi«cr 6 UScAAA e CXAXO 

ACAACCCGACCAACATCCTAGACATAAA ....LV.iw.A.."*"" 

TAGAIAGATTCTACAAAAGCTTGAG ^ v . L « tt *. U.uLy.C iyL.u 

TGACCCAAACACTGCTAGTACAAAA ^ fi , ftG iy V^GlyQly**" 01 * 

TACCGATGAACCCTACCTTAGAACAGA '«! , . m „l y l«;»W»«.;» 

GCCAGAAAGCTACATTAATCGCAGA^ . r*.Clil»»01u«ljHl» 

^AWaUm.GU^"^ 

CATTCGCACCAGCCCACCAGAGAAAGGCA ,„,.„C,;«lyl»"«'" 

AC „GGCAACAGAA<GCCGAGCAC ^ ^^^^^ , ,,,, 

CACACATCATGACAAACTGCCCAGATAGAC «„A...t«T». 

GAAAGAACCCCCGCAACTTCCCCGTGGCv ^ . Al4Ara Gl»Ly.ThrClu 

CCCCAGtCGATCCAGCAGTGGATCTACTGGA A1 .; r oAr»Ai.Giy 

GluThtPtotyrArgClaProPr ThrG CTTCCTCC ACCTCAATTCTCTCT 
GGGAGACACCATACAGGGAGCCACCAAt. ^ 



Ly«Art?ro?«l?«lThrAl*TyrIl«CluClyainPr ?«lGWV«lL«uL«uA«pTbr 

GAAAAcIcCACIACTCACACCATACAtTGACCCTCACCCACTACAAOTCTTCTIACACAC 

ClyAl«A«pA»p8«rIleValAlAClyIl«CiuLeu01yA8nA»uIyr8«rProLy»Xle 
AGGGGCTGACGACTCAATAGTAGCAGGAATAGAGTTAGCGAACAATTATACCCCAAAAAT 

2200 

V«lGlyGlyIleClyGlyPheIl«AioThriy«CluTyrLy»A«nV«lGluXUCluV«l 
AGTAGCGGGAATAGGGGGATTCATAAAIACCAAGGAATATAAAAAIGTACAAATACAAGT 

L«uA»oLy#Ly« VmlArgAl«ThrIl«MicThrClyA«pThrProIl«A«oIl«Ph«Cl7 
TCTAAATAAAAAGGTACCGGCCACCATAATGACAGGCGACACCCCAATCAACATTTTTGG 
. 2300 . • . . 

ArsA»nIleL«uThrAl«LeuG lytu 1 8«r L*uA«nLeuPro Va 1A1 «Ly s Va 1G LuPro 
CAOAAATATTCTGACAGCCTXAGUCATGTCATTAAATCTACCAGTCGCCAAACTACAGCC 

2400 

Il«Ly«Il*M«tL«uLy»ProG!yLy«A»pGlyProLy«L«uArgCloTrpProL«uThr 
AATAAAAAIAATGCTAAAGCCACCCAAAGATOCACCAAAACTCAGACAATOOCCCTTAAC 

• • • * 

Ly.CluLyilUGluAial.uLy.CtuIltCyiGluLy.M.tClttL^ 
AAAACAAAAAATAGAAGCACTAAAAGAAATCTOTCAAAAAATGCAAAAAOAACCCCAGCT 

2300 

GluGluAl*ProProThrAtnProTyTA«nThrProIhrPbeAl«H«Ly«LyiLyiiA#p 
AGAGGAAGCACCTCCAACTAATCCTTATAATACCCCCACATTTGCAATCAA6AAAAACGA 

• 

Ly8AinLy»XrpArgM«tL«uIl«A»pPh«Ar»CluL«uA«nLy»V«lTbr0lnA«pPh* 
CAAAAACAAAIGGAGGATGCTAATACATTTCAGACAACTAAACAAGGTAACTCAACATTT 

2600 ...» 
TbrGluIl«GlnLeuGlyIl«ProHi«PrpAl«GlyLeuAl«Ly«Ly»ArgArtIl«Thr 

CACACAAATTCAGITAGGAATTCCACACCCAGCAGGGTTCCCCAAGAAGACAAOAATTAC 

. . . • • 2700 

V«lLeuA«pV«lGlyAfpAl«TyrPh«5«rIl«ProLeuHi»GloA*pPh«ArgProTyr 

TCTACTAGATGtAGGGGATGCTTACTTTTCCATACCACTACATGACGACTTTAGACCATA 

ThrAl*PheThrL«uProS«rV«lA»oA«nAi*CluProGlyLy«ArgTyrIl«TyrLy8. 
TACTGCATTTACTCTACCATCAGTGAACAATGCAGAACCAGGAAAAAGATACATATATAA 

2800 

ValLeuProGloGlyTrpLytC ly S«r P roAlal lePh«G InH i«TbrM« t ArgC InVa 1 
AGTCTTGCC ACAGGGATGGAAGGGATCACCACCAATTTTTCAAC ACACAATGAGACAGGT 

l«ttGluP*0Ph«ArgLy«Al«AinLy8A«pV«Hl«Il«Il«ClBTyrM«tA»pA«pIl« 

ATTAGAACCATTCAGAAAAGCAAACAAGGATGTCATTATCATTCAGTACATGGATGATAT 

2900 . . • . * 

L«uIl«Al«SerA«pArgThrA«pL«uGluHiiA«pArgV«lV»lL«uGlnL«ttLyiGlu 
CTTAATAOCTAGIGACAGGACAGATTTAOAACATGATAGGGTAGTCCTGCAGCTCAAGCA 
. . * . . - Jooo 

L«uLeuA«nGlyL«uGlyPb«8«rTbrProA«pGluLy«Pb«GlnLy»A»pProProTyr 
ACTTCTAAATGGCC TAGGATTTTCTACCCCAGATGACAAGTTCCAAAAAGACCCTCCATA 
. . • • • * 

Hi«TrpM«tClyTyrG luLeuTrpProThr Ly iTrpLyi L«uC InLy • X leGlnLauPro 
CCACTGGAIGCCCTATGAACTATCGCCAACTAAATGGAAGTTGCAGAAAATACAGTTCCC 

3100 

GlnLy»CluIl«TrpTbrV«lAtnA«pIl«GlnLy8L«uV«lClyV«ll«uA»nTrpAl« 

CCAAAAAGAAATATCCACAGTCAATCACAICCACAAGCTAGTGCGTGICCTAAATTGGGC

Al«ClnL«uTyrProClyI L«Ly«ThrLyiHttL«uCy«ArgL«uIl«ArgClyLy«M«t
AGCACAACTCTACCCAGGGATAAAGACCAAACACTTATCTACCTTAAf CAGAGCAAAAAT 

3200 .... 
ThrL«aIhrCiuCluV«lClnTrpThrCluL«uAl*CluAl»GluLeuCluCluAenArg 
CACACTCACAOAACAACTACACTCCACACAATTACCACAACCAOACCTAOAAOAAAACAC 

3300 

Xl«Xl«L«»8«tClnCluCinCluClyHi«IyrT yrGlnGluCluLy§CluL«uGluAl« 
AATTATCCTAACCCAGGAACAACAGGCACACTATTACCAAGAAGAAAAAGACCTACAACC 

ThrV«lGlnLy*A«pGlnGiuAinGLnTrpThrTyrLy»Il«Hi»ClnCluC luLyille 
AACAGTCCAAAAGCA7CAAGAGAATCAG7GGACATATAAAATACACCAGGAAGAAAAAAT 

3400 

L«vLy«?«lGlyLysTyrAlaLysV«lLy«AsnThrBitTbrA«aGlyZl«Arf L«uL«u 
TCTAAAAGTAGCAAAATATCCAAAGGTGAAAAACACCCATACCAATGGAATCAGATTGTT 

..*••• 
Al«GlnVmlV«lGloLy«Il«GlyLy»GluAl*LeuV«lIl«TrpGlyArgIl«ProLy« 

AGCACACGTAGTTCAGAAAATACGAAAAGAAGCACTAG7CATTTGGGGACCAATACCAAA 

3500 . • 

Ph«Hi«L«uProV«lGluArgGluIl«TrpGluGlnTrpTrpA«pAsaTyrTrpGlaV«l 
ATTTCACCTACCAGTAGAGAGACAAATCTG6GAGCAGTGCTGG6ATAACTACT0GCAA0T 

, • • • JOOW 

ThrTroIleProAipTrpA«pPh«V«li«rThrf roProL«uV«lArgL«aAl*Ph«Aia 

GA^ATG^ATCCC^GACTG^GACTTCGTCTCTACCCCACCACTGCTCAGGTTAOCGTTTAA 

L«uV«lGlyA8pProZl«ProClyAlaG luThr Ph«TyrThr AspGly SorCy a AanArg 
CCTGGTAGGGGATCCTATACCAGGTGCAGAGACCTTCTACACAGATGGATCCTGCAATAG 

3700 

ClaS«rLysG luG lyLy* Al*GlyTyrVaLThrAipArgGlyLysA*pLy«V«lLy»Ly» 
CCAATCAAAAGAAGGAAAAGCACGATATCTAACAGATAGAGGGAAAGACAAGGTAAACAA 

LeuGiuGlnThrThrAtnG InG InAlaG luL«uC luAlaPheAl«MetAl«L«uTbrA«p 
ACTAGAGCAAACTACCAATCAGCAAGCAGAACTAGAACCCTTTCCGAICCCACTAACAGA 

3800 ...» 
SerClyProLy»Va lAan I lei 1«V» lAipS trG InTy r Va IMa tG ly I laSerAlaSar 
CTCGGGTCCAAAAGTTAATATTATACTAGACTCACAGTATGTAATGGGCATCAGTGCAAG 

3900 

GlnProThrGlu8arGluSerLyaIl«ValAanGinIleIlaGluGluMecIleI.yaLya 
CCAACCAACAGAGTCACAAAGTAAAATAGTCAACCAGATCATAGAAGAAATGATAAAAAA 

GluAlal laTyrVa 1A 1 aT r p Va IP roA 1 aH i • LyaG ly I 1 eG lyGlyAaoClnG iuVal 
GGAACCAATCTATGTTCCATCGCTCCCACCCC ACAAACCCATACCCCCAAACCACCAACT 

4000 

AapHiaL#uValS«r Cl&Clyl 1 «ArgC In Va 1 L#uFh« LtuG luLy« I 1«G luFroAl • 
ACATCATTTACTGAGTCAGGGTATCAGAC AAGTGTTCTTC CTGCAAAAAATAGACCCCGC 

• ♦ • • • • 

GlaGluGluHiaG luLyaTyrHisStrAanVa lLyaCluLauSarliiaLyiPhaGiy I It 
TCAGGAAGAACATCAAAAATATCATAGCAATGTAAAAGAACTGTCTCATAAATTTGGAAT 
« 4100 « . • • 

ProAanLeuVa lAlaArgClnllaValAanSarCyaAlaGlnCyaCloGlnl-yaGlyGlu 
ACCC AATTTAGTGGC AAGGCAAATAGTAAACTCATGTGCCCAATGTCAACAGAAAGGGGA 

4200 

AlalUHiaC lyGlnVa lAtnAUG luLauG lyThrTrpGUMetAapCyaThrHi«L«u 
AGCTATACATGCGCAAGTAAATCCAGAACTAGGCACTTGGCAAATGGACTCCACACATTT

ClvClyLysZl«Il«Il«ValAlaValIi«?alAl«a«sClTFh«Il«01uAl«GlaV«l

agaaggaaagatcattatagtagcagtacatgttgcaagtggatttatagaagcagaagt 

. * . 4300 1 

II f roCloClu8«rGlyArf GlnThrAl«L«uPh«L«uL«uty«L«uAl«S«rArtTri» 
CATCGCACAGGAATCAGGAAGACAAACAGCACTCTTCCTATTGAAACTCOCAAGTAGCTG 
•••••• 

ProXl«rhrHl«L«ttHi«ThrA«pA*aGlyAlaA«oPb«Tbr8«rGlaGluV*lLyaMat 

GCCAATAACACACTTGCATACACATAATGCTCCCAACITCACITCACACGAGGICAACAt 

4400 . , 

cgtagcatgctgcatagctaxagaacaatcciticgagtaccttaJa!?cc2cacagcc^ 

ClyV«lV«lGlqAl«M«tA«ttHi#Hi#LettLyiA.nGlBll«5 # rAr«Xl«AraGluCl n 
AGGAGIAGTAGAAGCAATGAATCACCATCTAAAAAACCAAAf AAGTA^OAAICAGAG^ 

• ' • • 
Al«ABnThrIl«GluThrIl«V«lL«uM«tAl«Il*Hi«Cy»M«tA«nPh«Lv.Ar»A^. 

GGCAAAIACAAXAGAAACAATAGTACTAATGCCAATTCATTOCAIG^TITTAA 

• • • 4600 

gcggggaaiaggogatatcactccatcacaaacattaaicaataJgaicaccI^^^^ 

CluIleGlnPb«L«uGlnAl«Ly.A#n8«rLyiL«uLyiA«BPh«ArBV«lT«r»h«A^. 

ACAGATACAATTCCTCCAAGCCAAAAATTCAAAATTAAA^G^AITITCGGGI^ 

• 4700 # 

A?i!fJ^ A 5 SA *£ GUL * uTrpLy,Cl y ProCl y Gl »l««»I.«uTrpLy«CiyOluGlyAi« 
AGAAGGCAGACATCACTTGTGGAAAGGACCTGGCGAACTACTGTOGAAAOGAGAAM 

AGTCCTACTCAAGGTAGGAACAGACATAAAAATAATACCAAG^ 

ArgA.pT y rClyG ly ArgCl 0 Gl«M.tA.pS«rGly8.;ai.L,«CluGlyAl.ArgG;u 
CA C A CA rT.;i?^ lUA * pLy * ArgTrpIleV * lValProThrT *P A '« v *l p «GlyAr» 

CAGACACTATGGAGGAAGAC AAGAGATCGATACTGGTTCCCAC CTGGAGGGTGCCAGGGA 



4900 

AipGlyGluM«CAla 



^ < .!! , ™^ 1 ' yi I rpHi " SerLettV,iLT,T,rL,uLy,TyrI 'y» ThrL y"A»pL«ttCiuLyt 

CCATGGAGAAATGGCATAGCCTTGTCAACTATCTAAAATACAAAACAAAGGATCTAGAAA 

V«lCy,TyrV.lProHi.Ui.Ly.V.lClyTrpAl«TrpTr P ThrCy.*SerAr«V«lil. 

AGGTGIGCTATGTTCCCCACCATAAGGTGCGATCGGCATCCTCGACTTGCAGCAGGGIAA 

5000 

T P ^! P Ifir«^ LyaClyABnSerHl « L ««CluIleClnAi«TyrTrpAsnLtoTbrProClu 

TATTCCC ATTAAAAGGAAACAGTCATCTAGAGATACAGGCATATTGCAACTTAACACCAC 

AAAAAGGATflec?^ 

aaaaaggatggctctcctcttattcactaacaataacttgg tacacagaaaagttctgga 

rA A !?^J^^ OA " pCy,AlaA,pV * 1 ^ eull€UisS « rT »»rTyrPh«ProC y .PheThr 

cacatgitaccccagactgtccacatctcctaaiacatagcacttAtttcccttgcttJa 

5200 

CAGCAGGTGAAGTAAGAACAGCCATCAGACGGGAAAAGTTATTGICCTGCTGCAATTAIC 

CCCGAGCTCATACAGC CCAGGTACCGTCACTTCAATTTCTGCCCTTAGTGGTAGTGC A AC 
• 5300

M«tXhrA«pFr ArgC luThr V* iPr Pr G lyA »n8«rG lyG luCluThr X 1 «G ly
A«ttA«pArgProGlaArgA»p SerThrThrArgLytG loArgArg Ar»A»pTy rArgArg 
AAAATCACAGACCCCAGACAGACACTACCACCAGGAAACACCCCCGAAGAGACTATCCGA 

• • • • • 5400 
G luAl*PheA l*TrpL«uA«nArgThrV# tCluAlal l«A»nArgG luA l« V* lA»nHi§ 

ClyL«uArgLeuAi*Ly»GlnA#pSerArg3erHiiLy»G lnArgS«r3«rGluSerPro 
GACGCCTTCGCCTGCCTAAACAGGACAGTAGAAGCCATAAACAGAGAACCACTGAATCAC 

• •••*. 

LauProArgG luL.au XlaPb«GlnValTrpGlnArg9erTrpArgTyrTrpHiaAapGlu 

ThrProArgThrTyrPbaProCly ValAl«01uValLauGluXl«I.auAla 
C7ACCCC0A0AACTTATTTTCCAGGTGTGGCAGAGGTCCTGGAGATACTGGCATGATGAA 

3500 

ClnGlyHatSarGluSarTyrTtarLyaTyrArgTyrLauCyallalleClnLyaAlaVa 1 
CAACGGATGTCAGAAAGTTACACAAACTATAGATATTTGTGCATAATACAGAAAGCACTG 

• • • ... 
TyrM«tUi«V*lArgLy«GlyCy B ThrCy«L«uClyArgC lyUiflClyProGlyGlyTrp 
TACATGCATGTTAGGAAAGGGTGTACTTGCCTCGGGACGGGACATGGGCCAGGAGGGTGG 

5600 .... 
ArgProClyProProProProProProProGlyL«uV«l 

HatAlaGluAlaProThrClu 
AGACCAGGGCCTCCTCCTCCTCCCCCTCCAGGTCTGGTCTAATGGCTGAAGCACCAACAG 

• • * . . 3700 
LeuProProV«lAapGlyTbrProLeuArgGluProGlyA«pGluTrpXl«Xl«GluZl« 

AGCTCCCCCCGGTGGATGGGACCCCACTCACGGAGCCAGGGGATGAGTCGAIAATAGAAA 
...... 

LeuArgG lul leLyaGluG luAiaLeuLyaHiaPbeAapProArgI.auLauXlaAlaI.au 
TCTTCACACAAATAAAACAAGAAGCTTTAAACCATTTTGACCCTCGCTTGCTAATTGCTC 

5800 

MetGluThrPreLeuLyaAlaProGlu8er8erLeu 
ClyLyeTyrXleTyrTbrArgHieClyAapTtarLeuGluGlyAlaArgCluLeuXlaLya 
TTGGCAAATATATCTATAC TAGACATGGAGACACCCTTGAAGGCGCCAGAGAGCTCATTA 
»•••». 
LyaSerCyaAanC luProPhe Ser ArgThr SerCluC InAapVe lAlaThrC loC luLeu 

V* lLeuCloArgAlaLeuPheTtarHiePheArgAlaGlyCyeCiyHieSarArglleGiy 
AAGTCCTGCAACGAGCCCTTTTCACGCACTTCACACCACGATGTGGCCACTCAAGAATTG 

5900 .... 
ALaArgGlnClyGluG luIleLeuSerG LnLeuTyrArgProLeuG LuTbrCy lAmAm 

G InThrArgG lyG lyA «nP ro Leu S e r A I a X 1 eP r o Thr P r o Arg AaoMe t C In 
G CC AGAC AAGGGGAGGAAATCC TCTCTCACCTATAC CGAC CCCTAGAAACATGCAATAAC 

6000 

SerCyaTyrCyaLyeArgCyeCyaTyrHiaCyeClnMacCy a P ha LauAanLyaG lyLeu 
TCATGCTATTGTAAGCCATGCTGCTACC ATTGTCAGATGTGTTTTCTAAACAAGGGGCTC 

•••••• 

G lyl l«Cy»TyrC luArg Ly«G LyArgArgArgArgThrProLy«Ly» TbrLy* Thrill 

M«tA«nC luArgAltAspCluG luG lyLcuG 1 nArgLy* L«uArgL«u XI « 
GGGATATGTTATGAACCAXAGGGCAGACGAACAAGCACTCCAAAGAAAACTAAGACTCAT 

6100 

ProSerProThrProA«pLyt 
ArgLeuLeuHisGlnThr 

MttMttAtnGlQLcuLeuIl«Ala.IL«L«uLtuAL* 

ccgictcctacaccagacaagtgagtatgatgaaicagctccttattgccattttattag" 

• • • • • . • 

SarAXaCyaLauValTyrCyaThrGinTyrValThrValPheTyrGly ValProThrTrp 
CTAGTGCTTCCTTAGTATATTGCACCCAATATGTAACTGXTTTCTATGGCGTACCCACCT 

6200 . . . .

Ly«A«nAifl^Il«ProLeuPh«Cy»AUTbrArgA»n^fc»pThrTrpClyThrIl«

GGAAAAATCCAACCATTCCCCTCTTTTGTGCAACCAGAAATAGGGATACTTGCGCAACCA 

6300 

ClaCy«L«ttProA»pAtnA«pA«pTyrCLnClttIleThrLe«A«nV«lThrCluAl«Phe 
TACAOTCCtTGCCTGACAATGATCATTATCACGAAATAACTTTGAATGTAACAGAGCCTT 

. • ♦ * * 

A«««i«Tr»A»oAinThrV«lThr01uClnAl«IleCluA«pV»lTrpHi»LtuPheClu 

IICAICCAICCAATAATACACTAACA0AACAACCAAIACAACATCTCTGCCATCTATTC6 

6400 

ThrS«rIlaLytProCy« V«lLy«LtuThrProLeuCysV» lAl*M«t Ly»Cy«S«r S#r 
ACACATCAATAAAACCATCTCTCAAACIAACACCTTTAICTCTACCAATCAAAXCCACCA 

• ••••• 
ThrCluS€rS«rThrClyA»nA8nThrIhr8«rLy«8«rThr8«rThrIhrThrThrThr 

GCACAGACACCACCACACCGAACAACACAACCTCAAAGAGCACAAGCACAACCACAACCA 
. 6500 ... * 

ProThrAspGlnCluGlaGlull«8«rGluA«pTbrPrpCy«Al«ArgAlaA«pAsoCya 
CACCCACAGACCAGGAGCAAGAGATAAGTGAGGATACTCCATGCGCACGCGCAGACAACT 

6600 

8«rGl7LauGlyGluGluGluThrIl«A«aCy«GlaPh«AaaM«tTnrGlyl«uG LuArg 
GCTCAGGATTGGGAGAGGAAGAAACGATCAATTGCCAGTTCAAXATGACAGGATTAGAAA 

• ••••• 

AapLyaLyaLyaG lnTyrAaoOluThrTrpTyrSarLyaAapVal ValCyaGluThrAao 
CAGATAAGAAAAAACAGTATAATGAAACATGGTACTCAAAAGATGTGGTTTGTGAGACAA 

. t . 6700 4 . 

AanSerThrAanGinThrGloCyaTyrMeCAanHiaCyaAaaTbrSarValllaThrGlu 
ATAATACCACAAATCAGACCCAGTGTTACATGAACCATTGCAACACATCAGTCATCACAG 

• ••••• 

SerCytAtpLy*HitXyrXrpA»pAlmXl«ArgPhtArgXyrCy»Al*7roPToG lyXy r 
AAXCAXCXCACAAGCACIAIXGGGAXGCXAIAAGGXXXA0ATACICXCCACCACCC60TI 

6800 . 
AlaL*uLcuArgCyiAsoAspXhrAsnXyrS«rG ly P h«A laP roA«nCy • 8 arLy • V« i 
ATCCCCTATTAACATGTAATGATACCAATTATTCAGGCTTTCCACCCAACTGTTCTAAAC 

6900 

ValAlaS«rItarCyiXhrArgHetM«tC iuXhrC lnlhr 8«rXhrXrpPheC iyPbeAan 
XACXACCXXCXACATCCACCAGGAXCAXGCAAACGCAAACXXCCACAXGCXXXGGCXXXA 

• • • ■ • • • 
GlyXtarArgAlaGiuAanArgXhrXyrllaXy rXrpH iaC iyArgAip AanArgXhr lie 

AXGGCACXACAGCAGAGAAXAGAACAXAXAXCXAXXGGCAXGGCAGAGATAATAGAACXA 

7000 

I l«S«rLauAaaLytXyrXyrA«nLauS«r LeuHisCyaLyaArgProGlyAaaLysXhr 
XCAXCAGCXXAAACAAAXAXXAXAAXCXCAGXXXCCAXXGXAAGAGGCCAGGCAAXAAGA 
*••••• 

VaiLyaGlnlleMatLauMatSarGlyHiaVa lPheHiaSerHiiXyrC lnProIlaAsn 
CAGXGAAAGAAAXAAXGCXXAXGXCAGGACAXGXGXXXCACXCCCACXACCAGCCGAXCA 

• 7 100 • • . • 
LyaArgProArgG InAl aXr pCy aXrpPh«Ly»G iy LysXrpLyi AtpAiaMetC loG lu 

AXAAAAGACCCAGACAAGCAXGGXGCXGGXXCAAAGGCAAAXGGAAAGACGCCAXGCAGG 

7200 

V« ILy iG luXhrLeuAl«Ly tUisP roArgTy rArgG iy Xhr A»nA»pIbrArgA«n 1 1 e 
AGGXGAAGGAAACCCXTGC AAAACAXCCCAGGTATAGAGGAAC CAATG ACACAAGGAA TA 

• * • ' • • * 
SarPheAiaAlaProC iyLyaC ly SarA*pProG iu ValAlaXyrMatXrpXhrAanCy t 

XIAGCIXXGCACCGCCAGCAAAAGCCXCACACCCAGAAGXAGCAXAC AXGXGGACXAACX 

7300 

ArgGiyCluPh«L«uXyrCysA«aM«cXhrXrpPheLeuAt&XrplltGluAtoLy»Xbr 
GCAGAGGAGAGXXXCXCXACXGCAACAXGACXXC6XTCCXCAAXXGGA7AGAGAAXAAGA

Ui»A**A»nIvrAUPr Cy aHia I leLyaG lol 1 el laAaalbr TrpUULya Va 1G ly
CACACC0CAATTATCCACCCTCCCATATAAACCAAATAATTAACACATCCCATAACCTA5 

7 400 .... 

ArmAanValTyrLauProPr ArgCluC lyGlulauSarCyaAao8arTorValThr8ar 
CCACAAATCTAtATTTCCCTCCCACCCAACOOOACCXCTCCTOCAACTCAACACTAACCA 

I • 7500 

Il-Il«AlaAanI leAapTrpCldAaBAaaAaaGleThrAanlleTtarPhaSarAlaGlu 
CCATAATTCCTAACATTCACTCCCAAAACAATAATCACACAAACATTACCTTTACTGCAC 

• • • 

ValAlaGlulauTyrArgLauGluLauClyAapTyrlyaLauValGluIlaThrProIla 

ACGTCGCAGAACTATACAGATTCGAGTTGGGAGATTATAAATTCCTACAAATAACACCAA 

7600 

GlyPhaAlaProTbrLyaG luLyaArgTyrSarSarAiaHiaGlyArgHiaThrArgG ly 
TTGGCTTCGCACCTACAAAAGAAAAAAGATACTCCTCTGCTCACGGGACACATACAAGAG 

,.««•• 
ValPtaaValiauGlyPhaLauGlyPbaLauAlaThrAlaClySarAlallatClyAlaAla 

GTGTGTTCGTGCTAGOGTTCTTGGCTTTTCTCGCAACAGCAGGTTCTGCAATGOOCOCCG 

7700 • 

SarLauThrValSarAlaGlnSarArgTtarLeuLauAlaGlyllaValClttClnGlftCIn 
CGTCCCTGACCGTGTCGGCTCAGTCCCCGACTTTACTGGCCGGGATA0T6CAGCAACAGC 

7800 

GiaLauLauAapVa 1 Va lLysArgG InC InG luLauLauArgLauThrTa lTrpOlyThr 
AACACCTGTTGCACGTGGTCAACAGACAACAACAACTGTTGCGACTGACCCTCTGOGGAA 

Ly»A»nL«uGlnAl*ArgV«lTbrAl«Il«GluLy*TyrL«ttG laAapGlaAlaArgLau 

CGAAAAACCTCCAGGCAACAGTCACTGCTATAGAGAAGTACCTACAGCACCACGCGCGGC 

7900 

AanSarTrpGlyCyaAlaPheArgGiaValCyaHiaTbrTbrV^lProTrpValAanAap 
TAAATTCATGGGGATGTGCGTTTAGACAAGTCTGCCACACTACTGTACCATGGGTTAATC 

SarLauAlaP roAapTrpAapAaaMatTbrTrpG loGluTrpG luLyaG laVa lArgTy r 
ATTCCTTAGCACCTGACTGGGACAATATGACGIGGCACCAATCGGAAAAACAAGTCCGCT 

i . 8000 t . • •• 

LauGiuAlaAaoI laSarLyaSerLeuGluGlaAlaGlallaGlaGlaGltiLyaAanMat 

ACCTGGAGGCAAATATCAGTAAAAGTTTAGAACAGGCACAAATTCACCAACACAAAAATA 

TyrCluLeuG laL ya LauAta Ser Tr pAap 1 1 ePh«G lyAaaTrpPba Aap LauThr Ser 
TCTATGAACTACAAAAATIAAATAGCTGGGATATTTTTCGCAATTGGTTTGACTTAACCT 

• ••••• 

TrpValLyaTyrlUCiaTyrCiy Va lLeu I lal laVa lAlaVal I laAlaLeuArg I la 
CCTGGGTCAACTATATTCAATATGGAGTGCTTATAATAGTACCACTAAIAGCTTTAAGAA 

8200 

ValllaTyrVa lValC InMet Leu8 er ArgLauArgLyiG lyTyr ArgPro Va IP ha Bar 
TAGTGATATATGTAGTACAAATGTTAAGTACGCTTAGAAAGGCCTAIAGGCCTCTTTTCT 

SarllaSatTbrArgTbrGlyAapSarGlaPro 
AaaProTyrPrpCiaGlyPrpClyTbrAlaSarGln 
SarProProClyTyrllaClaGlallaHiallaHiaLyaAapArgGlyGlaProAlaAan 
CTTCCCCCCCCCCTTATATCCAACAGATCCATATCCACAAGGACCCGGCACAGCCACCCA 

8300 .

ThrL7»Ly»CULy«LyiThrV«lGluAi«ThrV*lGluT&rA»pThrCi7Fr GlyArg
A.-ArgA.nArf AtiAr»ArgTrpLy»ClnAr»IrpArgGlBll«L«uAl«L»uAl«Aip 
Gl»GlaThr01uGluA«pGlyGly3«rA»nGl7GlyA«pArgTyrTrpProTrpProIl« 
ACaUOAAACAGAACAAGACCCTCGAAGCAACOGTGGAGACAOATACTCCCCCTCGCCCA 

8400 

8#rIl«IyrThrPheProA«pProProAl*Aip8«rProL«uA«pClnThrIl«ClnHi« 
Al«TyrIl«HiiPh«LtoIl«ArgGlBL«uIl«ArgL«ttL«uThTArgL«uTyr8erH« 

TAGCATATATACATTTCCTGATCCGCCAGCTGATTCGCCTCTTGACCACACTATACACCA 
• ••••• 

L«uG InG ly L«uThr I 1 eC InG luLeuProAipProProThrHisLeuProG luS«rC In 
CytArgA«pL«uL«uS«rArgScrPb«LcuthrL«uGlnL«uIL«TyrGlnA«tiL«uArg 
TCTGCACGCACTIACTATCCAGCACCTTCCTGACCCTCCAACTCATCTACCAGAATCTCA 

8500 

ArgL«uAlACluthr H« tC 1 yAla 8«rG ly 8«r LyiLy* 

AapTrpLcuArgLeuArgThr AlaPheL«uGlnTyrGlyCy«GlutrpX laGlnC luAl« 
CA6ACT0CCTGAGACTTACAACAGCCTTCTTCCAATATGGGTGCCAGTGGATCCAA0AAG 

Hi*S«rArgProPtoArgClyL«uGlnCluArgLeuL«uArgAl*ArgAl«GlyAl»Cy B 

PheClnAlaAlaAUArgAUThrArgGluTbrL«uAl*GlyAUCy»Arg01yL«uTrp 
CATICCAGCCCGCCCCGAGGGCTACAAGAGAGACTCTTGCCOGCGCGTGCAGOOCCTTCT 

8600 • 
GlyGlyTyrTrpAsnGluSerGlyG lyGluTyrS«rArgPh«GinCluClySerA«pArg 

ArgVolLauC iuArg 1 1 eG lyArgC ly 1 1 «LeuAl«V« lProArgArg I UArgG InG ly 
GCAGCGTATTGGAACG AATCGGGAGGGGAATACTCGCGGTTCCAAGAAGGATCAGACAGG 

8700 

GluGlnLysSerProSerCyaCluGlyArgGlnTyrGloG InG lyAipPhcMecAmThr 

AlaGluIl eA ULeuLeu 
GAGCAGAAATCGCCCTCCTGIGAGGGACGGCAGTATCAGCAGGGAGACTTTATGAATACT 

• • » • « • 

ProTrpLy«A«pProAl«Al«GluArgG luLy* AgnLtuTy rArgG InG loAinM«t A«p 
CCATCCAAGGACCCAGCACCAGAAAGGGAGAAAAATTTGTACAGGCAACAAAATATGGAT 

8800 • • 

A»pVaXAtpSarA«pA»pA*pA«pClnV«lArgVal8trV*lTbrProLyf ValProL«u 
GATCTA'GATTCAGATCATCATGACCAAGTAAGAGTTTCTGTCACACCAAAACTACCACTA 

ArgProMe t ThrHi«ArgLeuAl*I l«A»pMet SerHiaLcul leLyaThrArgC iyCly 

agaccaatgacacatagattggcaatagatatctcacatttaataaaaacaagggcggga 

. 89 00 

L«uGluGlyMetPh«TyrSerGluArgArgHi»LytIleLeuA80lLeTyrL«uGluLyB 
CTGGAACGGATGTTTTACAGTGAAAGAAGACATAAAATCTTAAATATATACTTAGAAAAG 

9000 

GluGluGlyUcI 1 eA 1 a A sp Tr pC lnA*o Ty r ThrH i »G LyP roG ly V« lArgTy rFro 
GAAGAAGGGATAATTGC AGATTGGCAGAACTACACTCATGGGCCAGCAGTAAGATACCC A 

• ♦ • • • • 
M«tPhtPheGlyTrpL«uTrpLy«L«uV*lProVt lAtpValProGlnC luG lyCUAtp 
ATCTTCTTTGGGTGGCTATGGAAGCTAGTACCAGTAGATGTCCCACAAGAAGGGGAGCAC 

9100 . 

ThrGluThrBisCy«L«uV« lHiaProAX«GlnThr8«rLyaPh«AtpA<pProBi«G ly 
ACTGAGAC TCACTGC TTAGTAC ATCCAGCAC AAACAAGCAAGTTTGATGACCCGCATCGG 

• *•**• 
GluThrLeuValTrpG luPh«AipProLeuL«uAl«Tyr8«rTyrGluAl*PheIl«Arg 
GAGACACTAGTCTGGGAGTTTGATCCCTTGCTGGCTTATAGTTACGAGGCTTTTATTCGG 

osnn ....

ii;SJ;^I^lIilS2l«?^«AAACACACC4ACACCT A TACITOCICACCCCACCA
ACIAACTAACAGAAACACCTCAGACTGCACCGACTTTCCACAACQCCCTCTAACCAAOCC 
AGGGACATGGGAGGAGCTGGTGCGGAACGCCCTCATATXCTCTGTAIAAATATACCCCCT 
AGCTTGCATXGTACTTCGGTCGCTCTGCGGACAOGCTGGCAGATTGACCCCTGGGAGGTT 

cictccaccactagca2g?Jgagcctgcgtcttccctgctagactctcaccagcacttgg 

CCGGTGCTGGGCAGACGGCCCCACGCTTCCTTGCTTAAAAACCTCCTTAAIAAAGCTGCC 

• • * ' 

AGTTAGAAGCA

Example 5 : Sequences of the Coding Regions
for the Envelope Protein and GAG
Product of the ROD HIV-2 Isolate

Through experimental analysis of the HIV-2 ROD isolate, the
following sequences were identified for the regions encoding the
env and gag gene products. One of ordinary skill in the art will
recognize that the numbering for both gene regions which follow
begins for convenience with "1" rather than the corresponding
number for its initial nucleotide as given in Example 4, above,
in the context of the complete genomic sequence.
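
The relation between the local numbering used below and the genomic numbering of Example 4 is a fixed offset, and the listings pair each run of nucleotides with its reading in three-letter amino acid codes. The following Python fragment sketches both operations for illustration only. The start positions are those given in Example 4 (Env 6147, GAG 546), the standard genetic code is assumed, and the function names and the short test fragment are hypothetical; the fragment is not offered as an authoritative transcription of the listing.

```python
# Start positions of the two coding regions as given in Example 4.
GENE_START = {"env": 6147, "gag": 546}


def to_genomic(gene, local_pos):
    """Map a position in the Example 5 numbering (starting at 1) to Example 4 numbering."""
    return GENE_START[gene] + local_pos - 1


def to_local(gene, genomic_pos):
    """Map an Example 4 genomic position to the local numbering of Example 5."""
    return genomic_pos - GENE_START[gene] + 1


# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",  # stop codon, written as asterisks
}


def translate(dna):
    """Translate DNA, read in frame from its first base, into three-letter codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    print(to_genomic("env", 1), to_genomic("env", 2574))
    print(to_local("gag", 2111))
    # Hypothetical 24-nucleotide fragment used only to exercise translate().
    print(translate("ATGATGAATCAGCTCCTTATTGCC"))
```

Under these assumptions, local position 1 of the env region corresponds to genomic nucleotide 6147 and local position 1 of the gag region to genomic nucleotide 546.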
Envelope sequence



M«tM«€Asa01nLtuL«uIl«Al«Xl«L«uL«uAU8«rAUCyt 

iTCATOAATCACCTCCTTATTGCCATTTtATTACCTACTCCTTOC 

• • • • 
L«oV»lTyrCy»TbrClnTyrV«lTbrV*lPh«TyrCly7»lPT© 
TTACTATATTGCACCCAATATCTAACTGTTTICTATOGCOTACCC 

ThrTrpLy»A»ftAlaTtarIi«ProL««Ph«Cy»Al*ThrArgA«» 
ACGTG«AAAAATOCAACCATTCCCCTCTTTTOTOCAACCACAAA? 
100 

ArgAspTferTrpGlyThrllaClaCyclaaPTaAspAsaAapAip 
AOOCATACTTOCCOAACCAIACAOTOCTTOCCTCACAATCATOAt 

TyrciaCluIl«ThTL«uA«aVAlTbrGlttAlaPh»A»pAUTrp 
TATCAaOAAATAACTTTOAATOTAACAOAOCCtTTTaATCCATflfl 

200 • . • 

AsnAtoTbrftlTbrGluCloAlmlUOlmAapfalTrpligL** 
AATAATACACTAACACAACAAOCAATACAAGATCTCTGOCATCTA 

• • • • • 
Ph«GlaThrttrXl«Ly»PreCy«ValLy»L««ThrFreL««ey» 
TTCGACACATCAATAAAACCATOTCTCAAACTAACACCTITATOT 

soo . 

V*lAl«M«cLyiCyi8tr8«rThrOl«i«ri«rThr01yA«uA«» 
CTACCAATCAAATOCACCACCACAQA0AGCA0CACAOG0AACAA0 

TbrThrl«rLy«8«rTbrS«rTbrTbrTbrThrTbrProTbrAtp 
ACAACCTCAAACA0CACAA0CACAACCACAACCACACCCA4AGAC 

400 

GlnGluGlnGluIl«S«rGluAip?hrProCysAl«ArgAUA*p 
CAGGAGCAAGAGATAAGTGAGCATACTCCATGCGCACGCGCAGAC 

A«nCy»S«rGlyL«uClyGluGlttGWThrIltA»oCyfClnPii 
AACTCCTCAOOATTCCCA0A0GAA0AAACCATCAATTGCCAGTTC 

• • • • 
AanM«tTbrGlyL«u01uArgA*pLyiLysLy»GlnTyrA«nGlu 
AAIATGACAGGATTAGAAAGAGATAAGAAAAAACACTATAATGAA 

300 ... 
ThrTrpTyr8«tLy«A«pV*lV*lCy$C luThrAtnA«ai«rThr 
ACATGGTACTCAAAAGATGTGGTTTCTCAGACAAATAATAGCACA 

• • • • 

A«nGlaXhr01aCytTyrM«CAa&EiaCyiA«a?hrMrValll« 
AATCAGACCCAGTQTTACATGAACCATTCCAACACATCACTCATC 

• 600 • # » 
ThrGlu8«rCyiAapLyiHi*TyrTrpA»pAlaI IftArgP hcArg 
ACAGAATCATCTGACAAGCACTATTOOOATGCTATAACCTTTAOA 

. . . • 

TyrCy»AL*ProProC ly?yrAl«L««L«ttArgCy lAftaAspTbr 
TACTGTCCACCACCGGOTTATCCCCTATTAACATCTAATGATACC 

700 . • 

A«nTyrft«rGlyPb«AlaPToAiiiCyt8*rLy»TalT«lAUI«r 
AATTAlTCACOCTTt«CACCCAACTGTTCTAAAGtAOTA«CrrCT

TbrCyatbrArgMatMatClttTbrC laTarSarTaxti pPbaGly
A CATCC ACCACCATCATOCAAACO C AAACTT C CACATOGTT TGGC 
. ♦ . 800 • 

PbaAaaClyThrA*gAlaGluAaaArgXarTy*XlaTyrTrplta 
TTTAATGGCACTAGAOCAGAGAATAGAACATATATCTATTOOCAT 

• • • • 

GlyArgAapAaaArgTbrXlaXlataTLauAaaLyalyrTyTAaa 
C GC AGACATAATAGAACTAT CAT C A6 C Tt AAAC AAATATTATAAY 

• • • too 

LauSarLauliaCyaLyaAvgf roGlyAaaLy aTarfa llyaG la 
CTCACTTT0CATI0TAACACCCCA0CCAATAA0ACA0T6AAACAA 

Il«M«tL«ttM«tt«rClyHiiV4lFb«Hi»l«rIl*tyrOlal*r» 
ATAAT0CTTATGTCAGGACATGTG7TTCACTCCCA6TAGCA0CC9 

• >[■ • • • • 
XlaAaaLyaArg ProArgOlaAlalrpCyaTrpPbaLyaOlyLya 
ATCAATAAAACACCCAOACAAGCATOOTGCTOOTTCAAAGGCAAA 

1000 . . . 

TrpLyaAayaiaMatGlaGlaValLyaTarLaaAlaLyalUrro 
IGGAAAGACGCCAYOCACGAGCTOAAGAGCCTTOCAAAACATCCC 

ArgTyrArgClyTbrAaaAtyTbrArgAaellalarPbaAlaAU 
AGGTATA6ACCAACCAATQACACAAGGAATATTAGCTTT0CAGC0 

1100 

FroClyLyaGlytarAapFroGlaVaUWTyrlUtTrpTfcTAaa 
CCAGGAAAAGGC TCAGACC CAGAAGTAGCAT AC ATGTCGACTAAO 
•■ • • • • 

CyaArgClyGluPbaLauTyrCyaAsaMatTbrTrpPbaLaaAaa 
IGCAGACOAGAGTTTCTCTACTGCAACATGACTTOGTTCCTCAAT 

1200 

TrpXlaOlaAaaLyatarHiaArgAaaTyrAUFroCyallaXla 
fGGATAGAOAATAAGACACACCGCAATTAtGCACCGTGCCATATA 

• • • • • 

LyaGlallallaAaaTbrlrpHiaLysValOlyArgAaaValTyr 
AAGCAAATAATTAACACATGGCAZAA0GTAGGGAGAAATGTATA7 

1300 

L«uProProArgGlu0lyCluL«u8«rCysAanS«rThrValTbr 
TTGCCTCCCAGGGAAGOGGAGCTCTCC tGCAACTCAACAGTAACC 

• • • • • 

•«rXlaXlaAlaAanXl«AapTrpGlaA«aAaaAaaClaTbrAaa 
AGCATAATTGCTAACATTGACTGGCAAAACAAf AATCAGACAAAC 

• • • • 
ZlaTarPhatarAlaCluValAl'GlvLaaTyrArgLauGluLaa 

ATTACCTTTAGTGCAGAGGT00CAGAACTATACAGAT7GGAGTTG 
1400 . . 

GlyAapTyrLyaLauTalGUIlaTbrProXlaGlyPbaAlAPro 
GCAGATTATAAATTCCTACAAATAACACCAATTGGCTTCGCACCT ,

ACAAAACAAAAAACATACTCCTCTQCTCAC00CA0ACATACAA8A
1300 

ClyV«l»h«V*H.«uClyPh«L«ttCl3r»h«L««Al«tbrAl«0ty 
CGTCTCTTCOTOCTACGGTTCTTCOGrTTTCTCCCAACAOCAflOT 

8«rAl«H«tClyAl*ArsAi*l«rL«u?hrTalS«rAla01aS«ff 

TCTOCAATCCOCOCTCGACCCTCCCTaACCCTCTCGGCTCAGTCC 

1600 

ArgTnrL«uL«uAl»0lyIl«V«lClaCln0loClnGl»L«al«» 
CCCACTTTACTCCCCCC0ATACTCCA0CAACAGCAACA0CT6TTC 

• • • • 
AtpV«lV«lLy»AriGlnCLaGluL«uL«uArsL«uf hrTtlf rp 
GACCTOGTCAAOACACAACAAOAACTOtTOCCACTGACCCTCtW 

1700 

C I ylhrLy •Aial.uClnAlmAri f « lTfcrAUXUOlulytty* 
CGAACCAAAAACCTCCAOOCAACAOTCACTCCTATACACAAOTAC 

• ■ • • 
L«uGlsAip61aAl«Arf L«uAsn8«rtrpGlyCy«Al«f fc«A*g 

CTACAC0ACCA0QCGC0CCTAAATTCATCCC0AT0T6C0TTTA«A 

GlnV«lCy»Bi»ThrIhr?*lf ro?rp?alA«nAipl«rl,«*Ala 

CAACTCTCCCACACTACTOTACCAICOOTTAATGATTCCTTAGeA 

• • • • 

ProA»pTrpAipA§nM«cIhrTrp01ttGluTrpGltilyiGl«?Ai 
CCTCACTGOOACAATATCACGT0GCAGQAAtCO0A.U.UCAAOtC 

• • • • • 
Ar g Tj*L • uG 1 uA 1 «A • nl 1 • 8 • r Ly • 0«r LVuG iuG In At atflm" 
CGCTACCTCCAOCCAAATATCAOTAAAAGTTTAOAACACCCACAA 

1900 

Il«GlaGlaGluLy«AanM«tTyrGluL*uGlBLytL«uA«at«r 
ATTCAOCAAOAOAAAAATATCTATCAACtACAAAAATTAAATAOC 

• • • • • 
TrpA«pXltPh«G lyAtoTrpPb«AtpLtuThr8«rTrp?AlLy« 

TGGGATATTTTTCGCAATTGGTTTGACTf AACCTCCTGGGTCAAG 

2000 

TyrIl«Glatyr01yy«iL«ttZl«Il«V«lAlaV«lf l«Al«L«tt 
TATATTCAATATGGAGTGCTTATAATAGTAGCACTAATAOCTTTA 

Argli»V»lIl«TyrV*17*lClnU«tL«u5«rArgL«uArtLy» 
AGAATAGTGATATATGTACTACAAATGTTAAGTAGGCTTAGAAAO 

2100 

GlyTyrArgProV*lPh«8«r8erProProGlyTyrXl«61a*** 
GCCTATA6GCCTGTTTTCTCTTCCCCCCGCCCTTATATCCAATA0

Il«Bl«XlaHisLjsAspAr8Glj61aPxeAl«Atn61ttOXttThr
ATCCATATCCACAACGACCGGGGACAOCCAGCCAACGAACAAACA 

2200 

«laGlaAspGlyGlyl«rAtaGly61jA«pArgT7rTrpProTrp 
«AAGAAGACGGtGGAAGCAACGGTGGA0ACA6ATACTG0GCet0a 

•%©IX«Al«TyrIl«Ei«Ph«I,«ttUtArgClnL«uIUArgL«tt 
OCCATACCATATATACATTTCCTCATCCCCCACCTCATTCQCCTC 

• • • • 
1*QTarArgL«uT7rS«rXl«CysArgAipI.«uL«uS«*Axgfar 
TTGACCAGACTATACACCATCTGCAGGGACTTACTATCCAGGAOC 

t>00 .... 
rk«L«ttThrLattGlaL«ttZl«T7rGtmA»«X.««ArtA«pTryL«« 
VTCCTOACCCTCCAACTCATCTACCAGAATCTCAGAGACTOGCTG 

Ar»L«ttATgthrAl*Ph«L«u0latyr«lyC7«Cltttrpil«0la 
AGACTTAGAACAGCCTTCTTGCAATATOGOTGCOAOTOOATCCAA 
2A0O 

G luA 1 «V hmO InA 1 «A 1 *A 1 «A r |A 1 •ThrArg 0 lot hrLaoAl • 
GAA0CATTCCAG0CCGCCGCGACGGCXACAAGACA6ACT0CTGCG 

• • • •• 
ClyAHCytArgOlyWuTrpArgT* U^uCluArg 1 UG l7Arg 
GGCCCGTCCAOCCCCTTCTCCACCGTATTQCAACOAATCCOGAOG 

2500 

«lyIl«L«uAl»?«l?roArgArgIl«ArgClnOlyAl«ClttIl« 
C0AATACTCGCGGTTCCAAGAAGGATCA6ACAG6GAGGA0AAATC 

Al4L«uL«u*«*0l7TbrAl«Val8«rAl«Gl7Argl«uT7rGlo 
GCCCTCCTGTGA6GGAC0GCAGTA7CACCAGGGAGACTTTAT0AA 

• • • 2600 • 
Xyr8«rK«tClttGl7Preg«rg«rArgL7sGl7GlttL7«f h«T«l 
TACTCCATGGAAGGACCCACCACCAGAAAOGGAGAAAAATTTQIA 

ClnAUThrLytTyrGly 
CAGGCAACAAAATATGGA 

• «

Gag sequence



Hat*iyAiAArf>aaSarvalX,auArfClyLyaLyaAlaAepOla 
ATOQGttCCCACAAACTCCCTCTTGACAGGGAAAAAAOCAGAtGAA 

L«uCluArtIl«Arf L«ttAr»Pro«ly«lyL7«I.7»Ly«Ty»Art 

ttaoaaaoaatcagcttacccccccccccaaaoaaaaaotacaoo 

- • •• 

I.auLysHiaXlaValTrpAlaAlaAa«l.jaL««AtpArsf a««l7 
CTAAAACATATTCTCTGOCCACOOAAYAAATTOOACAOAf TC^fiA 

100 . • • 

LauAlaGlaSarLauLauGlutarLycGiaOlyCyaOlaLyaXla 
TTAGCAGAGAGCCTGTTGGAGTCAJLAACAGGGTTCTCAAAAAATT 

L«uThrV«lL«ttA«p?ToM«tT»lProThr0lyt«r01ttA»aI^» 
CTIACAGTTTTAOATCCAAIOCTACCCACACCTTCACAAAATTtA 

200 . 
X.ya8arLau?haAanTBrValCyaTalXlaTrpCyaXl«Bi*AX* 
AAAAGTCTTTTTAATACTCTCTGCCTCATTTCQTGCATACACCCA 

• • .• ■ • • 
GluGluLyaValLyaAspThrGlttGlyAlaLyaGlaXlaVAlArf, 
GAAGAGAAAGTGAAAGATACTGAAGGAGCAAAACAAATAGTGCOG 

sot 

ArgHiaLauValAlaGluTarOlyTarAlaCluLyalUtf roftar 
AGACATCTAGTOGCAOAAACAGGAACTGCAOAGAAAATGCCAAGC 

Thr3«rArgProTBrAiaProi«rS«rCluLy»GiyOlyA»aryt 
ACAAGTAGACCAACAGCACCATCTAGCGAGAAGGGAGGAAATTAC 

400 

Pro V*lClDHi»V«i01yOiyA«oTyrThrHiaU«Prolau»«r 
CCAGTCCAACATCTAOOCGQCAACTACACCCATATACCCCTOAOT 

• • • • • 
ProArgThrLaaAaaAlaTrpValLy aX.au PalGluGluLyaLy-a 
C CC C GAACCC TAAATGCCTGGGTAAAATTAOf AGAGGAAAAAAAG 

PhaGlyAlaGluValValProClyPnaGlnAlaLauSarGluGly 
TTCGGGGCAGAAGTACTGCCAGGATTTCAGOCACTCTCAGAAGOC 
500 • • . • 

CyaTbrProTyrAapXlaAiaGLaHatLauAaaCya ValGlyAtp 
TGCACGCCCTATGATATCAACCAAAT6CT7AATTGTGTGGGGGAC 

HiaGlaAlaAlaMatOlaXlaXlaArjOluXlaXlaAanCluOla 
CATCAAGCAGCCATGCAGATAATCAGGGAGATTATCAATGAGGAA 
600 • • • 

Al«Al*GluTr?A»pV»101nHl«ProIl«ProGlyProL«»tr» 

gcagcaoaatogoatctgcaacatccaataccagcccccttaoga 

• • • • 

AlaGlyCloLaaArgGlurroArgGlytarAapXlaAlaGlytk* 
CCCGGGCACCTTAGAGAGCCAAGGGGATCTGACATACCAGGOAfiA 

700 . « 

TarSarThrTalGluGlttGlaXlaGlaTrpKacPhaArtProais 
ACAAGCACAGTACAACAACAGATCCAGTGGATOTTTAGCCCACAaV 
AaaProfalPr TalOlyAaaXlalyfArgArgf raXUOUlla,
AATCGtCTACCAOTAOOAAACATCTATAOAAOATOCATCCACATA 

800 

giyLaaGlaLyaCyeyalArgMatTyrAaaProTarABallalaa 
GCATTOCACAAGTCTOfCACGATCTACAACCCOACCAACATCCTA 

• • • • 
A»pIl«Ly iClnClyProLy ■01uFToFb»01tti«rIyrf*lAi» 

GACATAAAACAGGGACCAAAOGAGCCGfTCCAAAGCTATGTAOAT 

• . . too 

ArgFaaTyrLyagarLauArgAlaGluG laTaxAapP*oAlaV«l 
ACATTCtACAAAAOGTTGAGCOCAGAACAAACACATCCAaCAGTO 

• • • • ^ 
LytAsftTrpM«tTtarOlttThrL«ttL«ttfal01aAinAl«A«Bfr« 
AACAATTGGATQACCCAAACACTOCTAGTACAAAATCCCAACCCA 

• • • • • 
AapCyaLyaLavValLaaLyaGlyLattClyMatAaaf raTfcrLa* 
GACTGTAAATTAGTGCTAAAAGGACTA6GCATQAACCCXACSTTA 

1000 

Glu0lttMatL«ttTarAlaCytCia0ly?aiGly01yPra0lyGla 
GAACAOATCCTCACCOCCTOTCAOOOGgTAOGTCCOCCAOOCCAG 

• • • • • 
Ly«Al*Ar$L«uK«cAl«CittAiaI,tuly«CluY»lIU«lyf ro 
AAAOCTAGATTAATOGCAGAOGCCCTGAAAOAGGTCAtAOOACCt 

1100 

AUProIla?rorh«AlaAlaAlaGlaGlaAr|LyaAlarkaLya 
GCCCCTATCCCATTCOCAOCAOCCCACCAOAOAAACOCATTIAAA 

• • • - • 
CyaTrpAaaCyaGlyLyaGluGlyHlaOarAlaArgGlaCyaArg 

TGCT6GAACTGTGGAAAGGAAGGGCACTCGGCAA0ACAATGCCGA 

1200 

AlaPreArgArgGlaGlyCyaTrplyaOyaGlylyaProGlylia 
GCACCTAGAAOGCAOOOCIOCTGGAAOTGTGGTAAGCCAGOACAC 

• • • • • 
XlaMatTarAaaCyaProAapArgGlaAlaOlyPaaLauGlyLaa 
ATCATGACAAACTOCCCAGATAGACAOOCAGGTTTTTTAGOACTa 

1300 

GlyProTrpGlyLyaLyaProArgAaaPhaProValAlaGlaVal 

CGCCCTTGGGGAAAGAACCCCCGCAACTTCCCCfiTGGCCCAAGTT 

• • • • • 
ProGlaClyLauTarFroThrAlaProProValAapFroAlaTal 
CCGCAOOOGCTCACACCAACAOCACCCCCACTCGAICCAOCACTO 

• • • • 
AapLauLauGluLyaTyrMatGlnGlaGlyLyaArgGlaArgCla 
GATCTACTGGAGAAATATATGCAGCAAOGCAAAAOACAGAGAGAG 

1400 . 
GlaArgGluArgProTyrLyaGluVaLThrGluAapLaoLaoEia 
CACAGAOAGAGACCATACAAGOAAGTGACAGAGGACTTACtGCAC 

• • • • 
LauGluGlaGlyGluTarProTyrArgGlaProProTarOluAap 
CTCGAGCAGGGCGAOACACCATACAGGGAGCCACCAACAGAGGAC 

1500 . . • 

La uLauB i aLauAa a8 a rLauPhaClyLy i At pG la 
TTGCTGCACCtCAATTCTCTCTTTGGAAAAOACCAO 
Example 6: Peptide Sequences Encoded by the ENV and GAG Genes

The following coding regions for antigenic peptides, identified
for convenience only by the nucleotide numbers of Example 5,
within the env and gag gene regions are of particular interest.
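
As an illustrative aside, the nucleotide ranges given in parentheses below can be converted to peptide lengths. The following Python sketch is not part of the original disclosure. It assumes that each range is inclusive and begins at the first base of a codon; the names `peptide_length` and `REGIONS` are hypothetical.

```python
def peptide_length(start: int, end: int) -> int:
    """Return the number of codons in an inclusive nucleotide range.

    Assumption: the range starts in frame, so its length must be a
    multiple of three.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}-{end} spans {span} bases, not a whole number of codons")
    return span // 3


# Ranges as listed in this example for three of the peptides (hypothetical dict).
REGIONS = {"env1": (1732, 1809), "env2": (1912, 1983), "gag1": (991, 1053)}
lengths = {name: peptide_length(*bounds) for name, bounds in REGIONS.items()}
```

Under these assumptions the three ranges correspond to peptides of 26, 24 and 21 residues, respectively.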



env1 (1732-1809)

Ar g?« lfhrA i *X 1 «0 luty • *ft 
AGAfl TC AC TO C TATAGAC AA« TAG 
• • 

L«uOlBAipGlaAl*ArsL«uAtftt«rtrpGlyCyiAl*rfc«A*t 
CTACAOOACCAOOCOCOOCTAAATTCAtGGCOATOtGCOTTTACA 

• • • • 

CAAGTCTCC 



env2 (1912-1983) 



S«rLy«frtrLatt6WGliAl«41m 
AOTAAAACTTTAOAACACOCACAA 



IltGlnCloCluLT»A»nM«tTTrCluL«uClnLv«L«tti«2t«r 
ATTCAOCAAOAQaIaAATATGTATCAACTACAAAAATTAAATAGC 



I9y0 
Trp 
IGC 



env3 (1482-1530) 



tto Thrl.ytClul.y»AriIyrS«rS«rAl*BUGlyArfEl»ThrArf 
OCT ACAAAACAAAAAACATACTCCTCTGCTCACOGCAGACATACAAOA 
1300 



env4 (55-129) 

CytThrClntyry«lTh»V«lPh«IyrGiy7«lPTO 
TCCACCCAATATCTAACTGTTTTCTATOGCCTACCC 
• • • • 

ThrTrpLy«AaaAlmThrXl«rrel««Pb«Cy«AlAThr 
ACGTGGAAAAATCCAACCATTCCCCTCTTTTGTGCAACC 
100 
env5 (175-231)

CATOAt 

TyrClaGli»IUTa*L«uAtaV*lTDfGluAl»?taaA»pAlaTrp 
TAX CAOOAAATAACTTTOAATOTAA CAOAGOC TITTOAf CCATOO 

200 • • 

AsaAtB 
AATAAT 



env6 (274-330) 



«loThrl«rri«Ly»froCy«T«lLy»L«oTbrPraL«oCy» 
GACAGATCAATAAAACCATGTGTCAAACTAACACCTTTAXOT 

• • 300 • 

ValAlattatlytC/a 
CtAGCAATCAAATOC 



env7 (607-660) 



As&Hi«CysA«aThrlarTallla 

AACCATTGCAACACATCAGTCAfC 

ThrGlu8arCy«AaptytHi«TyrTrpAtp 
ACAGAATCATGTCACAAGCACTATTOOOAt 



env8 (661-720)



AlaXitArtrhaArg 
GCTAtAAGGTTTAOA 

TyrCytAlaProf roClyTyrAl»L«aL«ttArgCyiA»oA»pThr 
TACTCTGCACCACCGGGTTATGCCCTATTAAGATGTAATGATAGG 
_ 700 



env9 (997-1044) 



LyiArgFroArgOLaAltTrpCyaTrpPhcLyiOlyLyt 
AAAAGACCCAOACAAGCATOGTGCTOOTTCAAAOGGAAA 
1000 . . . 

TrpLysAap 

TGCAAAOAC 




env10 (1132-1215)



lyiClyf«rA»pFroGltt?«lAl4TyrMtcTrpTbrAi* 
AAACGCT6AGACCCAGAACTAGCATACATGTC0ACTAAC 
• • • • 

Cy»ArgCl701ttth«L«uTyrCy»A»aM»tIhrTrfPh«Lt«Ai« 
XGCACAGOAOAGTTTCTCTACTGCAACATCACTTOGTTCCTCAAf 

1200 



env11 (1237-1305)

ArtAsaTytAUFroCycIiaXU 
C 6C AATTATGGACCGTGCCATATA 

Ly«GloIl«Xl«Aa»ftartr»Ii.Ly»V«lClyAr|A«BV*lTyr 
AAGCAAATAATTAACACATGGCATAAGGTAGGOAGAAATGTATAT 

1300 



gag1 (991-1053)

AapCyslyaL«u?aLL«ttI.yftGlyI.«ttGlyMatA»a**oTkrL«m 
GACTGf AAATTAGTGCTAAAAGGACTAGGCATGAACCCZACCTTA 
1000 

Glu0l«MatL«tt7brAl« 
GAAGAGAIGCTGACCOCC 



Of the foregoing peptides, env1, env2, env3 and gag1 are
particularly contemplated for diagnostic purposes, and env4,
env5, env6, env7, env8, env9, env10 and env11 are particularly
contemplated as protecting agents. These peptides have been
selected in part because of their sequence homology to certain of
the envelope and gag protein products of other retroviruses
in the HIV group. For vaccinating purposes, the foregoing
peptides may be coupled to a carrier protein by utilizing
suitable and well known techniques to enhance the host's immune
response. Adjuvants such as calcium phosphate or alum hydroxide
may also be added. The foregoing peptides can be synthesized by
conventional protein synthesis techniques, such as that of
Merrifield.

It will be apparent to those skilled in the art that various
modifications and variations can be made in the processes and
products of the present invention. Thus, it is intended that the
present application cover the modifications and variations of
this invention provided they come within the scope of the
appended claims and their equivalents. For convenience in
interpreting the following claims, the following table sets forth the
correspondence between codon codes and amino acids and the
correspondence between three-letter and one-letter amino acid symbols.
                   DNA CODON             AMINO ACID 3 LET.      AMINO ACID 1 LET.

 1st  3rd  2nd:   T    C    A    G        T    C    A    G        T  C  A  G

  T    T         TTT  TCT  TAT  TGT      PHE  SER  TYR  CYS       F  S  Y  C
       C         TTC  TCC  TAC  TGC      PHE  SER  TYR  CYS       F  S  Y  C
       A         TTA  TCA  TAA  TGA      LEU  SER  ***  ***       L  S  *  *
       G         TTG  TCG  TAG  TGG      LEU  SER  ***  TRP       L  S  *  W

  C    T         CTT  CCT  CAT  CGT      LEU  PRO  HIS  ARG       L  P  H  R
       C         CTC  CCC  CAC  CGC      LEU  PRO  HIS  ARG       L  P  H  R
       A         CTA  CCA  CAA  CGA      LEU  PRO  GLN  ARG       L  P  Q  R
       G         CTG  CCG  CAG  CGG      LEU  PRO  GLN  ARG       L  P  Q  R

  A    T         ATT  ACT  AAT  AGT      ILE  THR  ASN  SER       I  T  N  S
       C         ATC  ACC  AAC  AGC      ILE  THR  ASN  SER       I  T  N  S
       A         ATA  ACA  AAA  AGA      ILE  THR  LYS  ARG       I  T  K  R
       G         ATG  ACG  AAG  AGG      MET  THR  LYS  ARG       M  T  K  R

  G    T         GTT  GCT  GAT  GGT      VAL  ALA  ASP  GLY       V  A  D  G
       C         GTC  GCC  GAC  GGC      VAL  ALA  ASP  GLY       V  A  D  G
       A         GTA  GCA  GAA  GGA      VAL  ALA  GLU  GLY       V  A  E  G
       G         GTG  GCG  GAG  GGG      VAL  ALA  GLU  GLY       V  A  E  G


3 Letter    1 Letter    CODONS

ALA         A           GCT GCC GCA GCG
ARG         R           CGT CGC CGA CGG AGA AGG
ASN         N           AAT AAC
ASP         D           GAT GAC
CYS         C           TGT TGC
GLN         Q           CAA CAG
GLU         E           GAA GAG
GLY         G           GGT GGC GGA GGG
HIS         H           CAT CAC
ILE         I           ATT ATC ATA
LEU         L           CTT CTC CTA CTG TTA TTG
LYS         K           AAA AAG
MET         M           ATG
PHE         F           TTT TTC
PRO         P           CCT CCC CCA CCG
SER         S           TCT TCC TCA TCG AGT AGC
THR         T           ACT ACC ACA ACG
TRP         W           TGG
TYR         Y           TAT TAC
VAL         V           GTT GTC GTA GTG
***                     TAA TAG TGA
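
The codon correspondence tabulated above can also be expressed as a small lookup. The following Python sketch is offered for illustration only and is not part of the original disclosure. It assumes the standard genetic code written with DNA bases (T rather than U), and the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# One-letter symbols for the 64 codons, ordered with the first base varying
# slowest and the third base fastest over T, C, A, G; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid symbols."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Illustrative usage with a short fragment taken from the sequence listing.
peptide = translate("CAGGCAACAAAATATGGA")  # Gln-Ala-Thr-Lys-Tyr-Gly
```

Here `peptide` evaluates to "QATKYG", matching the three-letter line printed above that fragment in the listing.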



